Growing polarization of the news media has been blamed for fanning disagreement, controversy, and even violence. Identifying polarized topics early is therefore an urgent matter that can help mitigate conflict. However, accurate measurement of topic-wise polarization remains an open research challenge. To address this gap, we propose Partisanship-aware Contextualized Topic Embeddings (PaCTE), a method to automatically detect polarized topics from partisan news sources. Specifically, using a language model that has been fine-tuned to recognize the partisanship of news articles, we represent the ideology of a news corpus on a topic with a corpus-contextualized topic embedding and measure polarization using cosine distance. We apply our method to a dataset of news articles about the COVID-19 pandemic. Extensive experiments on different news sources and topics demonstrate the efficacy of our method in capturing topical polarization, as indicated by its effectiveness in retrieving the most polarized topics.
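To make the measurement step concrete, the following is a minimal sketch of scoring topics by the cosine distance between two per-corpus topic embeddings and ranking them from most to least polarized. It is not the PaCTE implementation: the function names (`cosine_distance`, `rank_topics_by_polarization`), the corpus labels, and the two-dimensional toy vectors are all hypothetical, and the sketch assumes the topic embeddings have already been produced by some partisanship-aware language model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cos(u, v); 0 means same direction, 1 means orthogonal."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def rank_topics_by_polarization(
    corpus_a: Dict[str, Sequence[float]],
    corpus_b: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score each topic present in both corpora and sort by descending distance."""
    shared_topics = corpus_a.keys() & corpus_b.keys()
    scores = [
        (topic, cosine_distance(corpus_a[topic], corpus_b[topic]))
        for topic in shared_topics
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy embeddings (assumption: one vector per topic per corpus).
source_a = {"masks": [1.0, 0.0], "vaccines": [1.0, 1.0]}
source_b = {"masks": [0.0, 1.0], "vaccines": [1.0, 0.9]}

ranking = rank_topics_by_polarization(source_a, source_b)
for topic, score in ranking:
    print(f"{topic}: {score:.3f}")
```

In this toy setup the "masks" vectors are orthogonal, so that topic ranks above "vaccines", whose vectors point in nearly the same direction. Real topic embeddings would be high-dimensional, but the ranking logic would be unchanged.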