Studying misinformation and how to deal with unhealthy behaviours in online discussions has recently become an important field of research within social studies. With the rapid development of social media and the increasing amount of available information and sources, rigorous manual analysis of such discourses has become infeasible. Many approaches tackle the issue by studying the semantic and syntactic properties of discussions in a supervised manner, for example by applying natural language processing to a dataset labeled for abusive, fake or bot-generated content. Solutions that rely on ground truth are limited to domains in which ground truth is available. Within the context of misinformation, however, it may be difficult or even impossible to assign labels to instances. In this context, we consider the use of temporal dynamic patterns as an indicator of discussion health. Working in a domain for which ground truth was unavailable at the time (early COVID-19 pandemic discussions), we explore the characterization of discussions based on the volume and time of contributions. First, we explore the types of discussions in an unsupervised manner, and then characterize these types using the concept of ephemerality, which we formalize. Finally, we discuss the potential use of our ephemerality definition for labeling online discourses based on how desirable, healthy and constructive they are.