Growing polarization of the news media has been blamed for fanning disagreement, controversy, and even violence. Early identification of polarized topics is thus an urgent matter that can help mitigate conflict. However, accurate measurement of polarization remains an open research challenge. To address this gap, we propose Partisanship-aware Contextualized Topic Embeddings (PaCTE), a method to automatically detect polarized topics from partisan news sources. Specifically, we represent the ideology of a news source on a topic by a corpus-contextualized topic embedding, computed with a language model that has been fine-tuned to recognize the partisanship of news articles, and we measure the polarization between sources using cosine similarity. We apply our method to a corpus of news about the COVID-19 pandemic. Extensive experiments on different news sources and topics demonstrate the effectiveness of our method in precisely capturing topical polarization and alignment between news sources. To help clarify and validate the results, we explain the polarization using Moral Foundations Theory.
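The comparison step described above, measuring polarization between two sources on a topic with cosine similarity, can be illustrated with a minimal sketch. It assumes each source already has one embedding vector per topic. The vectors, topic names, and the helper `rank_topics_by_polarization` are hypothetical and chosen only for illustration; the sketch does not reproduce how PaCTE derives its embeddings from the fine-tuned language model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def rank_topics_by_polarization(
    emb_a: Dict[str, Sequence[float]],
    emb_b: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Hypothetical helper: sort shared topics by ascending similarity,
    so the topic on which the two sources differ most comes first."""
    shared = emb_a.keys() & emb_b.keys()
    scores = {t: cosine_similarity(emb_a[t], emb_b[t]) for t in shared}
    return sorted(scores.items(), key=lambda kv: kv[1])


# Toy, made-up topic embeddings for two sources (assumption: not real data).
source_a = {"masks": [0.9, 0.1, 0.0], "vaccines": [0.2, 0.8, 0.1]}
source_b = {"masks": [-0.7, 0.3, 0.1], "vaccines": [0.25, 0.75, 0.05]}

ranking = rank_topics_by_polarization(source_a, source_b)
for topic, sim in ranking:
    print(f"{topic}: cosine similarity = {sim:.3f}")
```

Under this reading, a low or negative similarity for a topic suggests the two sources frame it differently, while a value near 1 suggests alignment. Any threshold for calling a topic polarized would be a separate design choice that the abstract does not specify.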