TY - GEN
T1 - Use of conventional machine learning to optimize deep learning hyperparameters for NLP labeling tasks
AU - Gu, Yang
AU - Leroy, Gondy
N1 - Publisher Copyright:
© 2020 IEEE Computer Society. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Deep learning delivers good performance in classification tasks, but is suboptimal with small and unbalanced datasets, which are common in many domains. To address this limitation, we use conventional machine learning, specifically support vector machines (SVMs), to tune deep learning hyperparameters. We evaluated our approach on mental health electronic health records from which diagnostic criteria needed to be extracted. A bidirectional Long Short-Term Memory network (BI-LSTM) could not learn the labels for the seven scarcest classes, but its performance improved after training with the optimal class weights learned from tuning SVMs. With these customized class weights, the F1 scores for the rare classes rose from 0 to values between 18% and 57%. Overall, the BI-LSTM with SVM-customized class weights achieved a micro-averaged F1 of 47.1% across all classes, an improvement over the regular BI-LSTM's 45.9%. The main contribution lies in avoiding null performance for rare classes.
AB - Deep learning delivers good performance in classification tasks, but is suboptimal with small and unbalanced datasets, which are common in many domains. To address this limitation, we use conventional machine learning, specifically support vector machines (SVMs), to tune deep learning hyperparameters. We evaluated our approach on mental health electronic health records from which diagnostic criteria needed to be extracted. A bidirectional Long Short-Term Memory network (BI-LSTM) could not learn the labels for the seven scarcest classes, but its performance improved after training with the optimal class weights learned from tuning SVMs. With these customized class weights, the F1 scores for the rare classes rose from 0 to values between 18% and 57%. Overall, the BI-LSTM with SVM-customized class weights achieved a micro-averaged F1 of 47.1% across all classes, an improvement over the regular BI-LSTM's 45.9%. The main contribution lies in avoiding null performance for rare classes.
UR - http://www.scopus.com/inward/record.url?scp=85108145528&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85108145528&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85108145528
T3 - Proceedings of the Annual Hawaii International Conference on System Sciences
SP - 1026
EP - 1035
BT - Proceedings of the 53rd Annual Hawaii International Conference on System Sciences, HICSS 2020
A2 - Bui, Tung X.
PB - IEEE Computer Society
T2 - 53rd Annual Hawaii International Conference on System Sciences, HICSS 2020
Y2 - 7 January 2020 through 10 January 2020
ER -