Fumiya Okubo, Takayoshi Yamashita, Atsushi Shimada, Hiroaki Ogata - A Neural Network Approach for Students' Performance Prediction

Fumiya Okubo	Takayoshi Yamashita	Atsushi Shimada	Hiroaki Ogata
Faculty of Arts and Science, Kyushu University	Department of Computer Science, Chubu University	Faculty of Arts and Science, Kyushu University	Faculty of Arts and Science, Kyushu University
744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan	1200, Matsumoto-cho, Kasugai-shi, Aichi 487-8501, Japan	744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan	744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
fokubo@artsci.kyushu-u.ac.jp	yamashita@cs.chubu.ac.jp	atsushi@limu.ait.kyushu-u.ac.jp	hiroaki.ogata@gmail.com

ABSTRACT

In this paper, we propose a method for predicting students' final grades with a Recurrent Neural Network (RNN) from the log data stored in educational systems. We applied this method to the log data of 108 students and examined the prediction accuracy. A comparison with multiple regression analysis confirms that the RNN is effective for early prediction of final grades.

1. INTRODUCTION

In recent years, ICT-based educational systems have become widespread. These systems enable us to collect many types of log data that correspond to the learning activities of students. By analyzing these logs with data mining techniques, we can identify students' learning patterns, which helps teachers detect “at-risk” students ([1]). At Kyushu University, a learning support system called the M2B system was introduced in October 2014. The M2B system consists of three subsystems: the e-learning system Moodle, the e-portfolio system Mahara, and the e-book system BookLooper provided by Kyocera Maruzen, Inc. A number of investigations have been conducted using the logs of these systems ([3], [4], [5]).

Early prediction of students' final grades is an important task in the field of learning analytics; for example, it is investigated in [6] using regression analysis. In this paper, we propose a method for predicting students' final grades with a neural network approach, using the log data of the M2B system. In particular, to handle the weekly time series data of a course, we use a variant of a neural network called a Recurrent Neural Network (RNN) ([2]). By comparing our results with those obtained using regression analysis, we show how well an RNN predicts students' final grades.

2. DATA COLLECTION

We collected the learning logs of 108 students attending an Information Science course, which started in April 2016. In this course, the teacher and students used the LMS, the e-portfolio system, and the e-book system. Each week, the students were required to submit a report, answer a quiz, write a logbook of the lecture, and read slides for preview and review using the three systems. The logs of these learning activities were automatically graded by the system based on the criteria shown in Table 1:

Table 1 – Criteria for grading learning activities (column headers give the score awarded, from 5 to 0)

Activities	5	4	3	2	1	0
Attendance	Attendance		Being late			Absence
Quiz (rate of correct answers)	Above 80%	Above 60%	Above 40%	Above 20%	Above 10%	Otherwise
Report	Submission		Late submission			No submission
Course views	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
Slide views in BookLooper	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
Markers in BookLooper	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
Memos in BookLooper	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
Actions in BookLooper	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
Word count in Mahara	Upper 10%	Upper 20%	Upper 30%	Upper 40%	Upper 50%	Otherwise
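
The criteria in Table 1 can be read as simple scoring rules. The following is a minimal sketch of two of them, not the system's actual implementation: the function names are hypothetical, "Above 80%" is assumed to be a strict inequality, and "Upper 10%" is assumed to mean that a student's rank within the class falls in the top 10% (with tied students sharing the best rank).

```python
def quiz_score(rate):
    """Map a rate of correct answers in [0, 1] to a 0-5 score (Quiz row).

    Assumption: "Above 80%" means strictly greater than 0.8.
    """
    for threshold, score in [(0.8, 5), (0.6, 4), (0.4, 3), (0.2, 2), (0.1, 1)]:
        if rate > threshold:
            return score
    return 0


def rank_scores(counts):
    """Score every student from a list of activity counts ("Upper N%" rows).

    Assumption: a student's rank is 1 plus the number of students with a
    strictly larger count, and "Upper 10%" means rank <= 10% of the class.
    """
    n = len(counts)
    scores = []
    for c in counts:
        rank = 1 + sum(1 for other in counts if other > c)
        score = 0
        for percent, value in [(10, 5), (20, 4), (30, 3), (40, 2), (50, 1)]:
            # Integer comparison avoids floating-point rounding at the boundaries.
            if rank * 100 <= percent * n:
                score = value
                break
        scores.append(score)
    return scores
```

For example, under these assumptions a class of ten students with distinct counts receives the scores 5, 4, 3, 2, 1 for the five most active students and 0 for the rest.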

3. RECURRENT NEURAL NETWORK

A Recurrent Neural Network (RNN) handles time series data. Unlike an ordinary neural network, an RNN has a recursive loop, as shown in Figure 1. Through this loop, the RNN carries its internal state from the previous time step over to the current time step, and computes the output from both the current input and the past information. Its output can therefore take past states into account.
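
The recursive loop can be illustrated with a minimal sketch of a simple recurrent layer. This is an assumption for illustration only: the tanh activation, the weight names `W_xh`, `W_hh`, `b_h`, and the function name are not taken from the paper, which does not state its exact recurrence.

```python
import math


def rnn_forward(xs, W_xh, W_hh, b_h):
    """Run a simple recurrent layer over a sequence (illustrative sketch).

    xs   : list of input vectors, one per time step
    W_xh : hidden-by-input weight matrix, as a list of rows
    W_hh : hidden-by-hidden weight matrix, as a list of rows
    b_h  : hidden bias vector
    Returns the list of hidden states, one per time step.
    """
    h = [0.0] * len(b_h)  # initial state: no past information yet
    states = []
    for x in xs:
        new_h = []
        for j in range(len(b_h)):
            from_input = sum(w * v for w, v in zip(W_xh[j], x))
            from_past = sum(w * v for w, v in zip(W_hh[j], h))  # the recursive loop
            new_h.append(math.tanh(from_input + from_past + b_h[j]))
        h = new_h
        states.append(h)
    return states
```

In this sketch, an input that appears only at the first time step still influences the hidden state at later steps, because each state is computed from the previous one.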

The parameters of the RNN are trained by backpropagation through time (BPTT). BPTT propagates the error between the ground truth and the output at time t back to time t-1. Similarly, the error at time t-1 is propagated to time t-2, and so on, so that training proceeds backward in time. Although an RNN can in theory take all past information into account, in practice the error cannot be propagated into the distant past. Its output therefore reflects only the information from the last several time steps.
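
The difficulty of propagating the error into the distant past can be made concrete under an assumed recurrence. Suppose, for illustration only, that the hidden state is $h_t = \tanh(W_x x_t + W_h h_{t-1} + b)$ and that $E_t$ denotes the error at time $t$; these symbols are assumptions, not notation from the paper. Applying the chain rule step by step from time $t$ back to time $k < t$ gives:

```latex
\frac{\partial E_t}{\partial h_k}
  = \frac{\partial E_t}{\partial h_t}\, J_t\, J_{t-1} \cdots J_{k+1},
\qquad
J_j = \frac{\partial h_j}{\partial h_{j-1}}
    = \operatorname{diag}\!\left(1 - h_j \odot h_j\right) W_h ,

\left\lVert \frac{\partial E_t}{\partial h_k} \right\rVert
  \le \left\lVert \frac{\partial E_t}{\partial h_t} \right\rVert
      \, \lVert W_h \rVert^{\,t-k}
\quad \text{since } \left\lVert \operatorname{diag}(1 - h_j \odot h_j) \right\rVert \le 1 .
```

Under this assumption, whenever $\lVert W_h \rVert < 1$ the bound shrinks exponentially in the distance $t-k$, which is one standard way to see why only the last several time steps effectively contribute to training.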

To address this problem, Long Short-Term Memory (LSTM) is employed as a unit in the middle layer that stores long-term information. An LSTM unit has a memory cell that stores its internal state. Depending on the input and the internal state at the previous time step, the information in the memory is either kept, if it is useful, or discarded. In this paper, LSTM is used as the middle-layer unit of the RNN.
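
How the memory is kept or discarded can be sketched with a single-unit LSTM step. This follows the commonly used formulation with forget, input, and output gates; the paper does not state which LSTM variant it uses, so the gate names, the parameter layout `p`, and the function names are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM with scalar input x (sketch).

    p maps each gate name ("f", "i", "o", "g") to a tuple
    (input weight, recurrent weight, bias).
    Returns the new hidden state h and the new memory cell c.
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate: how much of the old memory to keep
    i = sigmoid(pre("i"))    # input gate: how much new information to write
    o = sigmoid(pre("o"))    # output gate: how much of the memory to expose
    g = math.tanh(pre("g"))  # candidate value to write into the memory
    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # new hidden state
    return h, c
```

When the forget gate is close to 1 and the input gate close to 0, the memory cell is carried over almost unchanged; when the forget gate is close to 0, the stored value is discarded.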

4. RESULTS

For each 1 <= i <= 15, the RNN was trained on the log data up to the i-th week. The input consisted of one vector per week containing the nine kinds of scores shown in Table 1, and the output was the student's final grade: A, B, C, D, or F. The trained RNN was then used to predict the final grades of students.
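
The shape of one training example can be sketched as follows. The function name `make_example` and the one-hot encoding of the final grade are assumptions for illustration; the paper states only that the weekly score vectors are the input and the final grade is the output.

```python
GRADES = ["A", "B", "C", "D", "F"]


def make_example(weekly_scores, final_grade, i):
    """Build one (input sequence, target) pair from the first i weeks.

    weekly_scores : list of weekly vectors, each with nine 0-5 scores
    final_grade   : one of "A", "B", "C", "D", "F"
    i             : number of weeks of log data to use (1 <= i <= 15)
    """
    if not 1 <= i <= len(weekly_scores):
        raise ValueError("i must be between 1 and the number of recorded weeks")
    inputs = [list(week) for week in weekly_scores[:i]]  # i vectors of nine scores
    # Assumption: the grade is encoded as a one-hot vector over five classes.
    target = [1 if grade == final_grade else 0 for grade in GRADES]
    return inputs, target
```

With i = 6, for instance, each student contributes a sequence of six nine-dimensional vectors and one five-dimensional target.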

We also examined the prediction of final grades using multiple regression analysis, where the final grades A, B, C, D, and F were replaced by 95, 85, 75, 65, and 55, respectively. For each 1 <= i <= 15, a multiple regression analysis was performed using the data up to the i-th week, and the resulting regression equation was used to predict the final grades.
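
To compare a regression output with a letter grade, the predicted score has to be mapped back to a grade. The sketch below uses the grade-to-score replacement stated above; mapping a predicted score to the nearest grade value is an assumption, since the paper does not say how this step was done, and the regression fit itself is not shown.

```python
GRADE_TO_SCORE = {"A": 95, "B": 85, "C": 75, "D": 65, "F": 55}


def score_to_grade(score):
    """Return the grade whose numeric value is closest to the predicted score.

    Assumption: nearest-value mapping. On an exact tie (e.g. 90), the grade
    listed first in GRADE_TO_SCORE, i.e. the higher grade, is returned.
    """
    return min(GRADE_TO_SCORE, key=lambda g: abs(GRADE_TO_SCORE[g] - score))


def accuracy(predicted_scores, true_grades):
    """Fraction of students whose mapped grade equals the true final grade."""
    hits = sum(
        score_to_grade(s) == g for s, g in zip(predicted_scores, true_grades)
    )
    return hits / len(true_grades)
```

For example, a predicted score of 88.2 maps to B under this rule, and scores below 55 map to F.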

The prediction accuracy of the RNN and of the multiple regression analysis, together with the adjusted R² of the regression, are summarized in Table 2. The accuracy of the RNN exceeds 90% once the log data up to the 6th week are used, whereas the accuracy of the multiple regression analysis is still below 90% with the log data up to the 9th week. Hence, the RNN is effective for early prediction of students' final grades.

Table 2 – Prediction accuracy for the final grade by the RNN and by multiple regression analysis, and the adjusted R² of the regression, using the log data up to each week

Weeks	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
RNN	50%	64%	73%	81%	87%	93%	93%	94%	98%	100%	100%	100%	100%	100%	99%
Multiple regression analysis	41%	46%	46%	52%	61%	63%	67%	75%	89%	92%	94%	94%	96%	100%	100%
Adjusted R²	.158	.212	.281	.325	.353	.379	.502	.620	.744	.772	.757	.758	.790	.951	.988
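
The week at which each method first exceeds 90% accuracy can be read off Table 2 directly. The short sketch below does this with the accuracy values copied from the table; the function and variable names are hypothetical.

```python
# Accuracy in percent for weeks 1..15, copied from Table 2.
rnn = [50, 64, 73, 81, 87, 93, 93, 94, 98, 100, 100, 100, 100, 100, 99]
regression = [41, 46, 46, 52, 61, 63, 67, 75, 89, 92, 94, 94, 96, 100, 100]


def first_week_above(accuracies, threshold):
    """Return the first week (1-based) whose accuracy exceeds the threshold."""
    for week, acc in enumerate(accuracies, start=1):
        if acc > threshold:
            return week
    return None  # the threshold is never exceeded


print(first_week_above(rnn, 90))         # prints 6
print(first_week_above(regression, 90))  # prints 10
```

On these values, the RNN passes 90% four weeks earlier than the multiple regression analysis.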

We remark that the learning activities that contribute to obtaining a given grade can be inferred from the weights of the trained RNN.

5. CONCLUSION

In this paper, we proposed a method for predicting students' final grades with a Recurrent Neural Network (RNN) from the log data stored in educational systems. The log data represented the learning activities of students who used the learning management system, the e-portfolio system, and the e-book system. We applied this method to the log data of 108 students. The accuracy of the RNN exceeds 90% once the log data up to the 6th week are used. Compared with multiple regression analysis, this shows that the RNN is effective for early prediction of final grades.

6. ACKNOWLEDGMENTS

This research was supported by "Research and Development on Fundamental and Utilization Technologies for Social Big Data" (178A03), a Commissioned Research of the National Institute of Information and Communications Technology (NICT), Japan, and by Grant-in-Aid for Scientific Research (S) No. 16H06304.
