Use of conventional machine learning to optimize deep learning hyperparameters for NLP labeling tasks

Yang Gu, Gondy Leroy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Deep learning delivers good performance in classification tasks but is suboptimal with small and unbalanced datasets, which are common in many domains. To address this limitation, we use conventional machine learning, i.e., support vector machines (SVMs), to tune deep learning hyperparameters. We evaluated our approach on mental health electronic health records from which diagnostic criteria needed to be extracted. A bidirectional Long Short-Term Memory network (BI-LSTM) could not learn the labels for the seven scarcest classes, but its performance improved after training with the optimal weights learned from tuning SVMs. With these customized class weights, the F1 scores for rare classes rose from 0 to values ranging from 18% to 57%. Overall, the BI-LSTM with SVM-customized class weights achieved a micro-averaged F1 of 47.1% across all classes, an improvement over the regular BI-LSTM's 45.9%. The main contribution lies in avoiding null performance for rare classes.
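
The abstract reports a large gain on rare classes (F1 from 0 to 18%-57%) alongside a small gain in micro-averaged F1 (45.9% to 47.1%). The following minimal sketch illustrates why both can hold at once: micro-averaging pools counts over all instances, so frequent classes dominate it. The label names, the toy gold and predicted labels, and the choice to exclude a "none" label from the micro-average are all assumptions made for illustration; none of it is the paper's data or code.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[gold] += 1  # gold label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP/FP/FN over `labels`, then compute one F1."""
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            if gold in labels:
                tp += 1
        else:
            if pred in labels:
                fp += 1
            if gold in labels:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 20 "common", 2 "rare", 8 "none" gold labels.
gold = ["common"] * 20 + ["rare"] * 2 + ["none"] * 8

# Hypothetical unweighted model: never predicts the rare class.
baseline = ["common"] * 14 + ["none"] * 6 + ["common"] * 2 + ["none"] * 8

# Hypothetical class-weighted model: recovers one of the two rare instances.
weighted = ["common"] * 14 + ["none"] * 6 + ["rare", "common"] + ["none"] * 8

target_labels = {"common", "rare"}  # assumption: "none" is not scored

for name, pred in [("baseline", baseline), ("weighted", weighted)]:
    rare = per_class_f1(gold, pred)["rare"]
    micro = micro_f1(gold, pred, target_labels)
    print(f"{name}: rare F1 = {rare:.3f}, micro F1 = {micro:.3f}")
```

In this toy setup the rare-class F1 moves from 0 to about 0.67 while the micro-average moves only from about 0.74 to about 0.79, because the rare class contributes just 2 of the 22 scored gold labels.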

Original language: English (US)
Title of host publication: Proceedings of the 53rd Annual Hawaii International Conference on System Sciences, HICSS 2020
Editors: Tung X. Bui
Publisher: IEEE Computer Society
Pages: 1026-1035
Number of pages: 10
ISBN (Electronic): 9780998133133
State: Published - 2020
Event: 53rd Annual Hawaii International Conference on System Sciences, HICSS 2020 - Maui, United States
Duration: Jan 7 2020 → Jan 10 2020

Publication series

Name: Proceedings of the Annual Hawaii International Conference on System Sciences
Volume: 2020-January
ISSN (Print): 1530-1605

Conference

Conference: 53rd Annual Hawaii International Conference on System Sciences, HICSS 2020
Country/Territory: United States
City: Maui
Period: 1/7/20 → 1/10/20

ASJC Scopus subject areas

  • General Engineering
