Predicting At-Risk Students' Academic Performance in an Online Learning Environment Using Learning Management System Interaction Data and the Random Forest Algorithm
Abstract
Purpose – This study develops a model that identifies at-risk students from learning management system (LMS) interaction data and examines how existing machine learning models can improve that identification.
Method – Predictive models were built using five classifiers: random forest (RF), support vector machine, Naive Bayes, logistic regression, and K-nearest neighbor. Each was used to predict student performance from LMS interactions in a dataset of 486 students from a local university.
Results – The Random Forest algorithm achieved a Matthews correlation coefficient (MCC) of 66.42%, a Kappa score of 64.94%, and an F1 score of 66.62%.
Conclusion – LMS platforms have enhanced education by improving accessibility and centralizing information, but identifying at-risk students remains a challenge. ML models such as Random Forest show promise in addressing it.
Recommendations – Use more reliable datasets, explore techniques for treating class imbalance, and integrate Random Forest predictive modeling to identify at-risk students in LMS.
Research Implications – This research promotes the use of robust methods for improving the accuracy of Random Forest predictive models that identify at-risk students.
Practical Implications – This research provides insights into predictive modeling with Random Forest and student interaction data from LMS, enabling timely interventions that improve student success and learning outcomes.
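
The three metrics reported in the Results can be illustrated with a minimal sketch. It assumes a binary at-risk / not-at-risk labeling and assumes that "Kappa" refers to Cohen's kappa; neither is confirmed by the abstract. The labels below are toy values invented for illustration, not the study's data, and the function names are hypothetical.

```python
from math import sqrt


def binary_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def f1_score(tp, tn, fp, fn):
    """F1: harmonic mean of precision and recall (tn is unused)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC: correlation between true and predicted labels, in [-1, 1]."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Toy labels (assumption): 1 = at-risk, 0 = not at-risk.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = binary_counts(y_true, y_pred)
f1 = f1_score(tp, tn, fp, fn)
mcc_value = matthews_corrcoef(tp, tn, fp, fn)
kappa = cohen_kappa(tp, tn, fp, fn)

print(f"F1={f1:.2%}  MCC={mcc_value:.2%}  Kappa={kappa:.2%}")
```

Unlike plain accuracy, MCC and kappa both account for agreement that would occur by chance, which makes them more informative when at-risk students are a minority of the class. Both range from -1 to 1, so a percentage such as 66.42% corresponds to a coefficient of about 0.66.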
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.