TY - GEN
T1 - Detecting Diabetes Risk from Social Media Activity
AU - Bell, Dane
AU - Laparra, Egoitz
AU - Kousik, Aditya
AU - Ishihara, Terron
AU - Surdeanu, Mihai
AU - Kobourov, Stephen
N1 - Publisher Copyright: © 2018 Association for Computational Linguistics.
PY - 2018
Y1 - 2018
N2 - This work explores the detection of individuals' risk of type 2 diabetes mellitus (T2DM) directly from their social media (Twitter) activity. Our approach extends a deep learning architecture with several contributions: following previous observations that language use differs by gender, it captures and uses gender information through domain adaptation; it captures recency of posts under the hypothesis that more recent posts are more representative of an individual's current risk status; and, lastly, it demonstrates that in this scenario where activity factors are sparsely represented in the data, a bag-of-words neural network model using custom dictionaries of food and activity words performs better than other neural sequence models. Our best model, which incorporates all these contributions, achieves a risk-detection F1 of 41.9, considerably higher than the baseline rate (36.9).
UR - http://www.scopus.com/inward/record.url?scp=85098433353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098433353&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85098433353
T3 - EMNLP 2018 - 9th International Workshop on Health Text Mining and Information Analysis, LOUHI 2018 - Proceedings of the Workshop
SP - 1
EP - 11
BT - EMNLP 2018 - 9th International Workshop on Health Text Mining and Information Analysis, LOUHI 2018 - Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 9th International Workshop on Health Text Mining and Information Analysis, LOUHI 2018, co-located with EMNLP 2018
Y2 - 31 October 2018
ER -