TY - GEN
T1 - Partially supervised learning for radical opinion identification in hate group web forums
AU - Yang, Ming
AU - Chen, Hsinchun
PY - 2012
Y1 - 2012
N2 - Web forums are frequently used as platforms for the exchange of information and opinions, as well as propaganda dissemination. But online content can be misused when the information being distributed, such as radical opinions, is unsolicited or inappropriate. However, radical opinion is highly hidden and distributed in Web forums, while non-radical content is unspecific and topically more diverse. It is costly and time-consuming to label a large amount of radical content (positive examples) and non-radical content (negative examples) for training classification systems. Nevertheless, it is easy to obtain large volumes of unlabeled content in Web forums. In this paper, we propose and develop a topic-sensitive partially supervised learning approach to address the difficulties in radical opinion identification in hate group Web forums. Specifically, we design a labeling heuristic to extract high-quality positive examples and negative examples from unlabeled datasets. The empirical evaluation results from two large hate group Web forums suggest that our proposed approach generally outperforms the benchmark techniques and exhibits more stable performance than its counterparts.
KW - Web forum
KW - document classification
KW - opinion mining
KW - partially supervised learning
UR - http://www.scopus.com/inward/record.url?scp=84867343799&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867343799&partnerID=8YFLogxK
U2 - 10.1109/ISI.2012.6284099
DO - 10.1109/ISI.2012.6284099
M3 - Conference contribution
AN - SCOPUS:84867343799
SN - 9781467321037
T3 - ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities
SP - 96
EP - 101
BT - ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics
T2 - 2012 10th IEEE International Conference on Intelligence and Security Informatics, ISI 2012
Y2 - 11 June 2012 through 14 June 2012
ER -