TY - GEN
T1 - Generative models for mining latent aspects and their ratings from short reviews
AU - Li, Huayu
AU - Lin, Rongcheng
AU - Hong, Richang
AU - Ge, Yong
N1 - Funding Information:
ACKNOWLEDGEMENTS: This research was supported in part by the National Institutes of Health under Grant 1R21AA023975-01 and the National Center for International Joint Research on E-Business Information Processing under Grant 2013B01035.
Publisher Copyright:
© 2015 IEEE.
PY - 2016/1/5
Y1 - 2016/1/5
AB - A large number of online reviews have accumulated on Web sites such as Amazon.com and Cnet.com. As the volume of reviews increases, it is increasingly challenging for both consumers and firms to digest them. A promising direction to ease this burden is to automatically identify the aspects of a product and reveal each individual's ratings on them from these reviews. The identified and rated aspects can help consumers understand the pros and cons of a product and make purchase decisions, and help firms learn from user feedback and improve their products and marketing strategy. While different methods have been introduced to tackle this problem, few of them successfully model the intrinsic connection between an aspect and its rating, particularly in short reviews. To this end, we first propose the Aspect Identification and Rating (AIR) model, which models observed textual reviews and overall ratings in a generative way, where the sampled aspect rating influences the sampling of sentiment words on this aspect. Furthermore, we enhance the AIR model to address one unique characteristic of short reviews, namely that the aspects mentioned in reviews may be quite unbalanced, and develop another model, AIRS. Within the AIRS model, we allow an aspect to directly affect the sampling of a latent rating on this aspect in order to capture the mutual influence between aspect and aspect rating throughout the whole generative process. Finally, we evaluate our two models and compare them with other methods on multiple real-world data sets, including hotel reviews, beer reviews and app reviews. Experimental results demonstrate the effectiveness of our models and their improvement over the compared methods. Other potential applications driven by our results are also shown in the experiments.
KW - Aspect Identification
KW - Rating
KW - Reviews
UR - http://www.scopus.com/inward/record.url?scp=84963579564&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963579564&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2015.28
DO - 10.1109/ICDM.2015.28
M3 - Conference contribution
AN - SCOPUS:84963579564
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 241
EP - 250
BT - Proceedings - 15th IEEE International Conference on Data Mining, ICDM 2015
A2 - Aggarwal, Charu
A2 - Zhou, Zhi-Hua
A2 - Tuzhilin, Alexander
A2 - Xiong, Hui
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Conference on Data Mining, ICDM 2015
Y2 - 14 November 2015 through 17 November 2015
ER -