Online learning platforms provide learning materials and answers to students' academic questions from experts, peers, or systems. This paper explores question-type identification as a step in content understanding for an online learning platform. The question-type identifier categorizes questions by type, based on their structure and complexity, using the question text, subject, and structural features. We define twelve question-type classes, including multiple-choice question (MCQ), essay, and others. We compiled an internal dataset of students' questions and labeled it using a combination of weak-supervision techniques and manual annotation. We then trained a BERT-based ensemble model on this dataset and evaluated it on a separate human-labeled test set. Our experiments yielded an F1-score of 0.94 for binary MCQ classification and promising results for 12-class multilabel classification. We deployed the model in our online learning platform as a key enabler of content understanding to enhance the student learning experience.
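To make the weak-supervision and evaluation steps concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a hypothetical heuristic labeling function that flags a question as MCQ when its text contains at least three distinct option markers such as `A)`, `b.`, or `(C)`, and it computes the binary F1-score from scratch. The names `OPTION_MARKER`, `weak_label_mcq`, and `f1_score`, the regular expression, and the threshold of three are all illustrative assumptions.

```python
import re

# Hypothetical structural feature: an option marker such as "A)", "b." or "(C)"
# that starts a token and is followed by whitespace.
OPTION_MARKER = re.compile(r"(?<!\S)\(?([A-Ea-e])[\).](?=\s)")


def weak_label_mcq(question: str) -> int:
    """Return a weak label: 1 if the text looks like an MCQ, otherwise 0."""
    letters = {m.group(1).upper() for m in OPTION_MARKER.finditer(question)}
    # Assumed threshold: three or more distinct option letters.
    return int(len(letters) >= 3)


def f1_score(y_true, y_pred) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    questions = [
        "What is 2+2? A) 3 B) 4 C) 5 D) 6",
        "Explain the causes of World War I.",
    ]
    print([weak_label_mcq(q) for q in questions])
```

In a setup like the one described, noisy labels of this kind would typically be combined with manual annotation to build the training set, while evaluation would rely only on human-labeled examples.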