TY - GEN
T1 - Using big data for predicting freshmen retention
AU - Ram, Sudha
AU - Wang, Yun
AU - Currim, Faiz
AU - Currim, Sabah
PY - 2015
Y1 - 2015
N2 - Traditional research in student retention is survey-based, relying on data collected from questionnaires, which is not optimal for proactive prediction and real-time decision (student intervention) support. Machine learning approaches have their own limitations. Therefore, in this research, we propose a big data approach to formulating a predictive model. We used commonly available (student demographic and academic) data in academic institutions augmented by derived implicit social networks from students' university smart card transactions. Furthermore, we applied a sequence learning method to infer students' campus integration from their purchasing behaviors. Since student retention data is highly imbalanced, we built a new ensemble classifier to predict students at risk of dropping out. For model evaluation, we use a real-world dataset of smart card transactions from a large educational institution. The experimental results show that the addition of campus integration and social behavior features refined using the ensemble method significantly improves prediction accuracy and recall.
AB - Traditional research in student retention is survey-based, relying on data collected from questionnaires, which is not optimal for proactive prediction and real-time decision (student intervention) support. Machine learning approaches have their own limitations. Therefore, in this research, we propose a big data approach to formulating a predictive model. We used commonly available (student demographic and academic) data in academic institutions augmented by derived implicit social networks from students' university smart card transactions. Furthermore, we applied a sequence learning method to infer students' campus integration from their purchasing behaviors. Since student retention data is highly imbalanced, we built a new ensemble classifier to predict students at risk of dropping out. For model evaluation, we use a real-world dataset of smart card transactions from a large educational institution. The experimental results show that the addition of campus integration and social behavior features refined using the ensemble method significantly improves prediction accuracy and recall.
KW - Data mining
KW - Machine learning
KW - Predictive modeling
KW - Social network analysis
UR - http://www.scopus.com/inward/record.url?scp=85126583404&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85126583404&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85126583404
SN - 9780996683111
T3 - 2015 International Conference on Information Systems: Exploring the Information Frontier, ICIS 2015
BT - 2015 International Conference on Information Systems
PB - Association for Information Systems
T2 - 2015 International Conference on Information Systems: Exploring the Information Frontier, ICIS 2015
Y2 - 13 December 2015 through 16 December 2015
ER -