Educational recommenders have received much less attention than e-commerce and entertainment recommenders, even though efficient intelligent tutors have great potential to improve learning gains. One of the main challenges in advancing this research direction is the scarcity of large, publicly available datasets. In this work, we release a large, novel dataset of learners engaging with educational videos in the wild. The dataset, named Personalised Educational Engagement with Knowledge Topics (PEEK), is the first publicly available dataset of this nature. The video lectures have been associated with Wikipedia concepts related to the lecture material, thus providing a human-intuitive taxonomy. We believe that granular learner engagement signals, combined with rich content representations, will pave the way to building powerful personalisation algorithms that will revolutionise educational and informational recommendation systems. Towards this goal, we 1) construct a novel dataset from a popular video lecture repository, 2) identify a set of benchmark algorithms to model engagement, and 3) run extensive experiments on the PEEK dataset to demonstrate its value. Our experiments with the dataset show promise for building powerful informational recommender systems. The dataset and the supporting code are publicly available.