Deep Learning for Opinion Mining and Topic Classification of Course Reviews

Student opinions about a course matter to educators and administrators, regardless of the type of course or institution. Reading and manually analyzing open-ended feedback becomes infeasible at the volume of comments generated at institution level or in online forums. In this paper, we collected and pre-processed a large number of course reviews publicly available online, and applied machine learning techniques to gain insight into student sentiments and topics. Specifically, we used current Natural Language Processing (NLP) techniques, such as word embeddings and deep neural networks, together with the state-of-the-art models BERT (Bidirectional Encoder Representations from Transformers), RoBERTa (Robustly optimized BERT approach) and XLNet (Generalized Autoregressive Pretraining). We ran extensive experiments to compare these techniques against traditional approaches. This comparative study demonstrates how to apply modern machine learning approaches to sentiment polarity extraction and topic-based classification of course feedback. For sentiment polarity, the top model was RoBERTa with 95.5% accuracy and 84.7% F1-macro, while for topic classification, an SVM (Support Vector Machine) was the top classifier with 79.8% accuracy and 80.6% F1-macro. We also explored in depth the effect of certain hyperparameters on model performance and discussed our observations. Institutions and course providers can use these findings as a guide for analyzing their own course feedback with NLP models for self-evaluation and improvement.
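The abstract reports both accuracy and F1-macro, and for sentiment polarity the two differ by more than ten points. The sketch below shows how the two metrics are computed and how they can diverge when one class is rare. The labels and predictions are invented toy data, not the paper's reviews, and class imbalance is offered only as one plausible reason for such a gap, not as the authors' stated explanation.

```python
# Toy illustration (assumed data): accuracy vs. macro-averaged F1.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced sentiment labels: 18 positive, 2 negative reviews.
y_true = ["pos"] * 18 + ["neg"] * 2
# The classifier gets every positive right but misses one of the two negatives.
y_pred = ["pos"] * 18 + ["neg", "pos"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy set a single error on the minority class leaves accuracy at 0.95 but pulls macro-F1 down to about 0.82, because each class contributes equally to the macro average regardless of its size.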

