With the onset of the COVID-19 pandemic, news outlets and social media have become central tools for disseminating and consuming information. Because these sources are easy to access, users seek COVID-19-related information from both social media (i.e., online news) and news outlets (i.e., offline news). Online and offline news are often connected: they share common topics, while each also covers topics of its own. A gap between these two news sources can lead to the propagation of misinformation. For instance, according to the Guardian, most COVID-19 misinformation comes from users on social media. When social media news goes without fact-checking, misinformation can pose health threats. In this paper, we focus on the novel problem of bridging the gap between online and offline data by monitoring the common and distinct topics they generate over time. We employ Twitter (online) and local news (offline) data spanning two years. Using online matrix factorization, we analyze the differences and commonalities between online and offline COVID-19-related data. We design experiments to show how online and offline data are linked and what trends they follow.
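To make the idea of tracking topics over time with online matrix factorization concrete, the following is a minimal sketch, not the method used in the paper, whose exact formulation is not given in this abstract. It assumes non-negative matrix factorization with Lee-Seung multiplicative updates, processes one document-term matrix per time step, and carries the topic matrix forward from one step to the next. The vocabulary, the toy counts, and all function names are hypothetical.

```python
import random

EPS = 1e-9  # keeps denominators strictly positive


def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rp in zip(v, wh) for x, y in zip(rv, rp))


def scale(m, num, den):
    """Element-wise m * num / (den + EPS)."""
    return [[x * n / (d + EPS) for x, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(m, num, den)]


def nmf_step(v, w, h):
    """One round of multiplicative updates for V ~ W H (all non-negative)."""
    ht = transpose(h)
    w = scale(w, matmul(v, ht), matmul(matmul(w, h), ht))
    wt = transpose(w)
    h = scale(h, matmul(wt, v), matmul(matmul(wt, w), h))
    return w, h


def online_nmf(batches, k, iters=50, seed=0):
    """Factorize each time step's matrix, warm-starting topics from the last step."""
    rng = random.Random(seed)
    n_terms = len(batches[0][0])
    # Topic-term matrix H is shared across time steps and updated incrementally.
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(k)]
    history = []
    for v in batches:
        # Document-topic weights W are re-initialized for each new batch.
        w = [[rng.random() + 0.1 for _ in range(k)] for _ in v]
        for _ in range(iters):
            w, h = nmf_step(v, w, h)
        history.append((w, [row[:] for row in h]))  # snapshot of topics at this step
    return history


# Hypothetical toy data: rows are documents, columns are term counts.
vocab = ["vaccine", "lockdown", "mask", "school"]
batches = [
    [[2, 1, 0, 0], [4, 2, 0, 0], [0, 0, 1, 3]],  # time step 1
    [[1, 0, 2, 0], [0, 3, 0, 1], [2, 0, 4, 0]],  # time step 2
]
history = online_nmf(batches, k=2)
```

Each entry of `history` holds the document-topic weights for one time step together with a snapshot of the topic-term matrix at that step. Running the same procedure separately on an online and an offline corpus and comparing the topic snapshots would be one way to look for common and distinct topics; that comparison is not shown here.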