The recent pandemic has changed the way we see education. It is not surprising that children and college students are no longer the only ones using online education. Millions of adults have signed up for online classes and courses in recent years, and MOOC providers such as Coursera or edX report millions of new users registering on their platforms. However, learners do face challenges when choosing courses: although online review systems are standard in many verticals, no standardized or fully decentralized review system exists in the MOOC ecosystem. In this vein, we believe there is an opportunity to leverage openly available MOOC reviews to build simpler and more transparent reviewing systems that allow users to truly identify the best courses. Specifically, in our research we analyze 2.4 million reviews (the largest MOOC review dataset used to date) from five different platforms in order to determine: (1) whether numeric ratings provide discriminant information to learners, (2) whether NLP-driven sentiment analysis on textual reviews could provide valuable information to learners, (3) whether NLP-driven topic finding techniques can be leveraged to infer themes that are important for learners, and (4) whether these models can be used to effectively characterize MOOCs based on the open reviews. Results show that numeric ratings are clearly biased (63\% of them are 5-star ratings), and that topic modeling reveals interesting topics related to course advertisements, real-world applicability, and the difficulty of the different courses. We expect our study to shed some light on the area and promote a more transparent approach to online education reviews, which are becoming increasingly popular as we enter the post-pandemic era.
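The following is a minimal, illustrative sketch (not the authors' implementation) of the kind of analysis the abstract describes: checking whether numeric ratings are skewed and extracting themes from review texts with LDA topic modeling. The DataFrame, its column names ("rating", "text"), and the sample reviews are hypothetical placeholders; the paper's actual dataset, preprocessing, and models may differ.

```python
# Hypothetical sketch of the analysis pipeline described in the abstract.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Placeholder data standing in for the 2.4M-review dataset.
reviews = pd.DataFrame({
    "rating": [5, 5, 4, 5, 2],
    "text": [
        "Great course, very practical examples",
        "Excellent instructor and clear explanations",
        "Useful content but the assignments were too easy",
        "Loved it, highly recommended",
        "Too much advertising for the paid certificate",
    ],
})

# (1) Rating distribution: a heavy concentration of 5-star ratings suggests
# the numeric score alone carries little discriminant information.
print(reviews["rating"].value_counts(normalize=True))

# (3) Topic modeling on the review texts with LDA.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
doc_term = vectorizer.fit_transform(reviews["text"])
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top terms per topic to inspect recurring themes
# (e.g., advertisements, applicability, difficulty).
terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"topic {i}: {', '.join(top)}")
```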