Social media users who report content are key allies in the management of online misinformation. However, no research has yet been conducted to understand their role and the different trends underlying their reporting activity. We suggest an original approach to studying misinformation: examining it from the reporting users' perspective, at the content level and comparatively across regions and platforms. We propose the first classification of reported content pieces, resulting from a review of approximately 9,000 items reported on Facebook and Instagram in France, the UK, and the US in June 2020. This allows us to observe meaningful distinctions in reported content between countries and platforms, as it varies significantly in volume, type, topic, and manipulation technique. Examining six of these techniques, we identify a novel one that is specific to Instagram US and significantly more sophisticated than the others, potentially presenting a concrete challenge for algorithmic detection and human moderation. We also identify four reporting behaviours, from which we derive four types of noise capable of explaining half of the inaccuracy found in content reported as misinformation. Finally, we show that breaking down the user reporting signal into a plurality of behaviours allows us to train a simple yet competitive classifier on a small dataset, using a combination of basic user reports, to classify the different types of reported content pieces.