Prediction of Learning Outcomes With a Machine Learning Algorithm Based On Online Learning Behavior Data In Blended Courses

Computing Sciences and Computer Engineering


Learning outcomes can be predicted with machine learning algorithms that analyze students' online behavior data. However, few generalized predictive models cover a large number of blended courses across different disciplines and cohorts. In this study, we examined learning outcomes using the learning data from all of the blended courses offered at a Chinese university and proposed a new method for classifying blended courses. Students were first clustered on the basis of their online learning behaviors using the expectation–maximization algorithm. Each blended course was then assigned to the category of the student cluster that accounted for the highest proportion of its students. The advantage of this method is that the classification criteria are clearly defined in terms of students' online behavior data, so machine learning systems can apply them to classify blended courses algorithmically from log data collected by a learning management system. Drawing on this classification, we also proposed and validated a general model that uses the random forest algorithm to predict learning outcomes from students' online behaviors in blended courses across different disciplines and cohorts. The findings indicated that after blended courses were classified on the basis of students' online behavior, prediction accuracy in each category increased. The overall accuracies for course categories I (380 of the 661 courses retained after screening), L (14 courses), A (237 courses), V (8 courses), and H (22 courses) were 38.2%, 48.4%, 42.3%, 42.4%, and 74.7%, respectively.
These results suggest that a prerequisite for accurately predicting students' learning outcomes in a blended course is that most students are highly engaged in a variety of online learning activities rather than focused on only one type of activity, such as only watching online videos or only submitting online assignments. For category H courses, the prediction model achieved accuracies of 80.6%, 85.3%, 63%, 54.8%, and 14.3% for grades A, B, C, D, and F, respectively. The results demonstrated the potential of the proposed model for accurately predicting learning outcomes in blended courses. Finally, we found that no single online learning behavior had a dominant effect on the prediction of students' final grades.
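The course classification step described above (assigning each course to the student cluster with the highest proportion) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classify_course`, the course identifiers, the sample data, and the alphabetical tie-breaking rule are all hypothetical. It also assumes that students have already been assigned cluster labels and that those labels match the course category names (I, L, A, V, H).

```python
from collections import Counter


def classify_course(student_clusters):
    """Return (dominant cluster label, its share of students) for one course."""
    if not student_clusters:
        raise ValueError("course has no clustered students")
    counts = Counter(student_clusters)
    # Highest count wins; ties are broken alphabetically by label.
    # The tie-breaking rule is an assumption, not taken from the study.
    label, count = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return label, count / len(student_clusters)


# Hypothetical data: the cluster label of each student enrolled in a course.
courses = {
    "course_001": ["H", "H", "I", "H", "A"],
    "course_002": ["I", "I", "I", "A"],
    "course_003": ["A", "V", "A", "L", "A", "I"],
}

categories = {cid: classify_course(labels) for cid, labels in courses.items()}
for cid, (label, share) in categories.items():
    print(f"{cid}: category {label} ({share:.0%} of students)")
```

In a full pipeline along the lines the abstract describes, the cluster labels would come from an expectation–maximization clustering of behavior features, and a separate random forest model would then be trained within each resulting course category.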

Publication Title

Asia Pacific Education Review
