TY - JOUR
T1 - Evaluation of Naive Bayes and Support Vector Machines for Wikipedia
AU - Mocherla, Sridhar
AU - Danehy, Alexander
AU - Impey, Christopher
N1 - Publisher Copyright: © 2017 Taylor & Francis.
PY - 2017/11/26
Y1 - 2017/11/26
N2 - Wikipedia has become the de facto source for information on the web, and it has experienced exponential growth since its inception. Text Classification with Wikipedia has seen limited research in the past with the goal of studying and evaluating different classification techniques. To this end, we compare and illustrate the effectiveness of two standard classifiers in the text classification literature, Naive Bayes (Multinomial) and Support Vector Machines (SVM), on the full English Wikipedia corpus for six different categories. For each category, we build training sets using subject matter experts and Wikipedia portals and then evaluate Precision/Recall values using a random sampling approach. Our results show that SVM (linear kernel) performs exceptionally across all categories, and the accuracy of Naive Bayes is inferior in some categories, whereas its generalizing capability is on par with SVM.
AB - Wikipedia has become the de facto source for information on the web, and it has experienced exponential growth since its inception. Text Classification with Wikipedia has seen limited research in the past with the goal of studying and evaluating different classification techniques. To this end, we compare and illustrate the effectiveness of two standard classifiers in the text classification literature, Naive Bayes (Multinomial) and Support Vector Machines (SVM), on the full English Wikipedia corpus for six different categories. For each category, we build training sets using subject matter experts and Wikipedia portals and then evaluate Precision/Recall values using a random sampling approach. Our results show that SVM (linear kernel) performs exceptionally across all categories, and the accuracy of Naive Bayes is inferior in some categories, whereas its generalizing capability is on par with SVM.
UR - http://www.scopus.com/inward/record.url?scp=85042209376&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042209376&partnerID=8YFLogxK
U2 - 10.1080/08839514.2018.1440907
DO - 10.1080/08839514.2018.1440907
M3 - Article
AN - SCOPUS:85042209376
SN - 0883-9514
VL - 31
SP - 733
EP - 744
JO - Applied Artificial Intelligence
JF - Applied Artificial Intelligence
IS - 9-10
ER -