Sentiment analysis on Chinese health forums: A preliminary study of different language models

Yan Zhang, Yong Zhang, Jennifer Xu, Chunxiao Xing, Hsinchun Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations


Sentiment analysis on Chinese health forums is challenging because of the characteristics of the language, the platform, and the domain. Our research investigates the impact of three factors on sentiment analysis: sentiment polarity distribution, language models, and model settings. We manually labeled a large sample of Chinese health forum posts, which showed an extremely unbalanced distribution with a very small percentage of negative posts, and found that a balanced training set could produce higher accuracy than an unbalanced one. We also found that hybrid approaches combining multiple language-model-based approaches to sentiment analysis performed better than individual approaches. Finally, we evaluated the effects of different model settings and improved the overall accuracy by using the hybrid approaches in their optimal settings. Findings from this preliminary study provide deeper insights into the problem of sentiment analysis on Chinese health forums and will inform future sentiment analysis studies.

Original language: English (US)
Title of host publication: Smart Health - International Conference, ICSH 2015, Revised Selected Papers
Editors: Hsinchun Chen, Daniel Dajun Zeng, Xiaolong Zheng, Scott J. Leischow
Number of pages: 14
ISBN (Print): 9783319291741
State: Published - 2016
Event: International Conference for Smart Health, ICSH 2015 - Phoenix, United States
Duration: Nov 17 2015 to Nov 18 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: International Conference for Smart Health, ICSH 2015
Country/Territory: United States


Keywords

  • Chinese health forum
  • Language model
  • Sentiment analysis

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

