The role played by YouTube's recommendation algorithm in unwittingly promoting misinformation and conspiracy theories is not entirely understood. Yet, this can have dire real-world consequences, especially when pseudoscientific content is promoted to users at critical times, such as during the COVID-19 pandemic. In this paper, we set out to characterize and detect pseudoscientific misinformation on YouTube. We collect 6.6K videos related to COVID-19, the Flat Earth theory, and the anti-vaccination and anti-mask movements. Using crowdsourcing, we annotate them as pseudoscience, legitimate science, or irrelevant, and we train a deep learning classifier that detects pseudoscientific videos with an accuracy of 0.79. We quantify user exposure to this content on various parts of the platform and how this exposure changes based on the user's watch history. We find that YouTube suggests more pseudoscientific content for traditional pseudoscientific topics (e.g., Flat Earth, anti-vaccination) than for emerging ones (e.g., COVID-19). At the same time, these recommendations are more common on the search results page than on a user's homepage or among the recommendations shown while watching a video. Finally, we shed light on how a user's watch history substantially affects the type of recommended videos.
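To make the classification step concrete, the following is a minimal sketch of a three-class text classifier over video metadata (e.g., a video's title, tags, and transcript concatenated into one token sequence). It is an illustrative stand-in under stated assumptions, not the authors' architecture: the class `VideoTextClassifier`, its dimensions, and the toy token IDs are all hypothetical placeholders; only the three labels come from the paper.

```python
# Illustrative sketch only: a small three-class classifier over a video's
# text metadata. Model shape, vocabulary size, and the toy inputs below
# are hypothetical; the labels follow the paper's annotation scheme.
import torch
import torch.nn as nn

LABELS = ["pseudoscience", "legitimate science", "irrelevant"]

class VideoTextClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 64):
        super().__init__()
        # EmbeddingBag averages token embeddings into one vector per video.
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.head = nn.Sequential(
            nn.Linear(embed_dim, 32),
            nn.ReLU(),
            nn.Linear(32, len(LABELS)),
        )

    def forward(self, token_ids: torch.Tensor, offsets: torch.Tensor):
        return self.head(self.embed(token_ids, offsets))

# Toy usage: two "videos" packed into one flat token tensor with offsets.
model = VideoTextClassifier(vocab_size=10_000)
token_ids = torch.tensor([1, 5, 20, 7, 42])  # tokens of both videos, flattened
offsets = torch.tensor([0, 3])               # video 1 = ids[0:3], video 2 = ids[3:]
logits = model(token_ids, offsets)           # shape: (2, 3)
print([LABELS[i] for i in logits.argmax(dim=1)])
```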
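A measurement like the search-results comparison could in principle be approximated with the public YouTube Data API v3: fetch the top results for a query and count how many a trained classifier flags as pseudoscience. In the sketch below, the `search().list` call is the real API endpoint, but the helper `pseudoscience_share` and the `classify` callable are hypothetical, and this is not the paper's actual crawling methodology.

```python
# Hypothetical audit sketch: fraction of top search results for a query
# that a classifier labels "pseudoscience". Assumes a YouTube Data API v3
# key and a `classify(text) -> label` function such as the one above.
from googleapiclient.discovery import build

def pseudoscience_share(query: str, api_key: str, classify) -> float:
    """Return the fraction of top-50 search results labeled pseudoscience."""
    youtube = build("youtube", "v3", developerKey=api_key)
    response = youtube.search().list(
        part="snippet", q=query, type="video", maxResults=50
    ).execute()
    items = response.get("items", [])
    if not items:
        return 0.0
    hits = sum(
        1
        for item in items
        if classify(item["snippet"]["title"] + " " +
                    item["snippet"]["description"]) == "pseudoscience"
    )
    return hits / len(items)
```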