The role played by YouTube's recommendation algorithm in unwittingly promoting misinformation and conspiracy theories is not entirely understood. Yet, this can have dire real-world consequences, especially when pseudoscientific content is promoted to users at critical times, such as during the COVID-19 pandemic. In this paper, we set out to characterize and detect pseudoscientific misinformation on YouTube. We collect 6.6K videos related to COVID-19, the Flat Earth theory, and the anti-vaccination and anti-mask movements. Using crowdsourcing, we annotate them as pseudoscience, legitimate science, or irrelevant, and we train a deep learning classifier that detects pseudoscientific videos with an accuracy of 0.79. We quantify user exposure to this content on various parts of the platform and how this exposure changes based on the user's watch history. We find that YouTube suggests more pseudoscientific content for traditional pseudoscientific topics (e.g., Flat Earth, anti-vaccination) than for emerging ones (e.g., COVID-19). At the same time, these recommendations are more common on the search results page than on a user's homepage or in the recommendations shown while actively watching videos. Finally, we shed light on how a user's watch history substantially affects the type of recommended videos.
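To make the classification step concrete, below is a minimal sketch of what a three-class video classifier (pseudoscience, legitimate science, irrelevant) could look like. This is not the authors' implementation: representing each video by free text such as its title and description, the hashed bag-of-words features, and the small PyTorch MLP are all illustrative assumptions.

```python
# Minimal sketch of a three-class video classifier (assumed design, not the
# paper's actual model). Each video is represented by free text (e.g., title
# plus description) hashed into a fixed-size bag-of-words vector.
import torch
import torch.nn as nn

NUM_FEATURES = 2048   # hashed bag-of-words dimensionality (assumption)
NUM_CLASSES = 3       # pseudoscience, legitimate science, irrelevant

def featurize(text: str) -> torch.Tensor:
    """Hash whitespace tokens into a fixed-size bag-of-words vector."""
    vec = torch.zeros(NUM_FEATURES)
    for token in text.lower().split():
        vec[hash(token) % NUM_FEATURES] += 1.0
    return vec

model = nn.Sequential(
    nn.Linear(NUM_FEATURES, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, NUM_CLASSES),  # logits for the three labels
)

def train(texts, labels, epochs=10):
    """Cross-entropy training on crowdsourced (text, label) pairs."""
    model.train()
    X = torch.stack([featurize(t) for t in texts])
    y = torch.tensor(labels)  # 0=pseudoscience, 1=science, 2=irrelevant
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()

def predict(text: str) -> int:
    """Return the predicted label index for one video's text."""
    model.eval()
    with torch.no_grad():
        return model(featurize(text).unsqueeze(0)).argmax(dim=1).item()
```

In practice, a model like this would be trained on the crowdsourced annotations and evaluated with held-out videos; the reported accuracy of 0.79 refers to the paper's own classifier, not to this sketch.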