Millions of learners worldwide now use intelligent tutoring systems (ITSs). At their core, ITSs rely on machine learning algorithms that track each user's changing performance level over time in order to provide personalized instruction. Crucially, student performance models are trained on interaction sequence data from previous learners and then used to analyse data generated by future learners. This induces a cold-start problem when a new course is introduced for which no training data is available. Here, we consider transfer learning techniques as a way to provide accurate performance predictions for new courses by leveraging log data from existing courses. We study two settings: (i) in the naive transfer setting, we propose course-agnostic performance models that can be applied to any course; (ii) in the inductive transfer setting, we tune pre-trained course-agnostic performance models to new courses using small-scale target course data (e.g., collected during a pilot study). We evaluate the proposed techniques using student interaction sequence data from five different mathematics courses, covering over 47,000 students in a real-world large-scale ITS. The course-agnostic models, which use additional features provided by human domain experts (e.g., difficulty ratings for questions in the new course) but no student interaction training data for the new course, achieve prediction accuracy on par with standard BKT and PFA models that use training data from thousands of students in the new course. In the inductive setting, our transfer learning approach yields more accurate predictions than conventional performance models when only limited student interaction training data (<100 students) is available to both.
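To make the two settings concrete, the following is a minimal sketch and not the models evaluated in this work. It assumes a plain logistic regression over three hypothetical course-agnostic features (a bias term, an expert difficulty rating in [0, 1], and the student's prior fraction of correct responses), trained on synthetic data. The naive setting applies source-course weights unchanged. The inductive setting is approximated here by continuing gradient descent on a small target sample while penalizing distance from the pre-trained weights. All names, features, and hyperparameters are illustrative assumptions.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, x):
    # Probability that the student answers the next question correctly.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def log_loss(w, X, y):
    # Mean negative log-likelihood, clamped to avoid log(0).
    eps = 1e-12
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(predict(w, x), eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(X)

def fit(X, y, w_init, anchor=0.0, lr=0.1, steps=300):
    # Gradient descent on: mean log loss + (anchor / 2) * ||w - w_init||^2.
    # With anchor > 0 the solution is kept close to the starting weights,
    # which is one simple (assumed) way to fine-tune on very little data.
    w = list(w_init)
    n = len(X)
    for _ in range(steps):
        grad = [anchor * (wi - w0) for wi, w0 in zip(w, w_init)]
        for x, t in zip(X, y):
            err = predict(w, x) - t
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def simulate(n, true_w, rng):
    # Synthetic interactions: [bias, expert difficulty, prior fraction correct].
    X, y = [], []
    for _ in range(n):
        x = [1.0, rng.random(), rng.random()]
        X.append(x)
        y.append(1 if rng.random() < predict(true_w, x) else 0)
    return X, y

rng = random.Random(0)
source_X, source_y = simulate(500, [0.5, -2.0, 2.0], rng)  # existing course
target_X, target_y = simulate(60, [0.0, -1.5, 2.5], rng)   # small pilot sample

# Naive transfer: train on the source course only, apply as-is to the target.
w_naive = fit(source_X, source_y, [0.0, 0.0, 0.0])

# Inductive transfer: start from the pre-trained weights and tune on the pilot data.
w_tuned = fit(target_X, target_y, w_naive, anchor=0.1)

print("naive target log loss:", log_loss(w_naive, target_X, target_y))
print("tuned target log loss:", log_loss(w_tuned, target_X, target_y))
```

The `anchor` penalty is a design choice for this sketch: with fewer than 100 target students, unconstrained refitting can overfit, so the tuned weights are pulled toward the pre-trained ones. The losses printed here are measured on the pilot data itself, so a fair comparison would use held-out target students.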