Abstract
We develop a method to assess the quality of peer-produced content in knowledge repositories using their development and coordination histories. We also develop a process for identifying relevant features for quality assessment models, along with algorithms for processing datasets in large-scale knowledge repositories. On English-language Wikipedia articles, models using these features outperform existing methods for quality assessment. We achieve an overall accuracy of 81 percent, a 7 percent improvement over existing models. In addition, our features improve the precision and recall of each class by up to 9 percent and 17 percent, respectively. Finally, our models remain robust under ten-fold cross-validation and across the techniques used for classification. Overall, our research provides a comprehensive design science framework both for identifying and efficiently extracting features related to development and coordination activities and for assessing quality using these features. We also provide details of a potential implementation of a quality assessment system for knowledge repositories.
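For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how overall accuracy, per-class precision and recall, and a ten-fold split are commonly computed. It is not the authors' pipeline: the function names `k_fold_indices` and `evaluate` and the class labels "A", "B", and "C" are hypothetical and chosen only for illustration.

```python
from collections import Counter


def k_fold_indices(n_items, k=10):
    """Partition the indices 0..n_items-1 into k folds of near-equal size.

    Assumption: a simple round-robin split with no shuffling or stratification.
    """
    return [list(range(start, n_items, k)) for start in range(k)]


def evaluate(y_true, y_pred):
    """Return overall accuracy and a {label: (precision, recall)} mapping."""
    actual = Counter(y_true)       # number of items that truly belong to each class
    predicted = Counter(y_pred)    # number of items assigned to each class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / len(y_true)
    per_class = {}
    for label in sorted(set(actual) | set(predicted)):
        precision = correct[label] / predicted[label] if predicted[label] else 0.0
        recall = correct[label] / actual[label] if actual[label] else 0.0
        per_class[label] = (precision, recall)
    return accuracy, per_class


# Hypothetical quality labels for four articles.
y_true = ["A", "A", "B", "C"]
y_pred = ["A", "B", "B", "C"]
accuracy, per_class = evaluate(y_true, y_pred)
folds = k_fold_indices(25, k=10)
```

In a ten-fold evaluation, each fold serves once as the held-out test set while the remaining nine are used for training, and the metrics are then averaged over the ten runs.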
Original language | English (US) |
---|---|
Pages (from-to) | 478-512 |
Number of pages | 35 |
Journal | Journal of Management Information Systems |
Volume | 36 |
Issue number | 2 |
State | Published - Apr 3 2019 |
Keywords
- Wikipedia
- big data analytics
- design science
- knowledge repositories
- peer-produced content
- predictive analytics
ASJC Scopus subject areas
- Management Information Systems
- Computer Science Applications
- Management Science and Operations Research
- Information Systems and Management