We address an important gap in the detection of political bias in news articles. Previous work based on supervised document classification can be biased toward the writing style of each news outlet, leading to overfitting and limited generalizability. Our approach overcomes this limitation by considering both sentence-level semantics and document-level rhetorical structure, which makes political bias detection more robust and style-agnostic. We introduce a novel multi-head hierarchical attention model that encodes the structure of long documents through a diverse ensemble of attention heads. While journalism follows a formalized rhetorical structure, writing style may vary by news outlet. We demonstrate that our method overcomes this domain dependency and outperforms previous approaches in both robustness and accuracy. Further analysis shows that our model captures the discourse structures commonly used in journalism.
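To make the idea of multi-head hierarchical attention more concrete, the following is a minimal pure-Python sketch and not the authors' implementation. It assumes a two-level hierarchy (word vectors pooled into sentence vectors, sentence vectors pooled into a document vector) and uses fixed query vectors where a real model would learn them. All names (`attention_pool`, `encode_document`, `word_query`, `head_queries`) are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(items: List[Vector], query: Vector) -> Tuple[List[float], Vector]:
    """One attention head: weight each item by its similarity to the query,
    then return the weights and the weighted sum of the items."""
    weights = softmax([dot(item, query) for item in items])
    dim = len(items[0])
    pooled = [sum(w * item[i] for w, item in zip(weights, items)) for i in range(dim)]
    return weights, pooled


def encode_document(
    doc: List[List[Vector]],      # sentences, each a list of word vectors
    word_query: Vector,           # assumed: a single word-level query
    head_queries: List[Vector],   # assumed: one sentence-level query per head
) -> Tuple[List[List[float]], Vector]:
    """Word level: pool the words of each sentence into a sentence vector.
    Sentence level: each head pools the sentences with its own query.
    The document vector is the concatenation of all head outputs."""
    sentence_vectors = [attention_pool(words, word_query)[1] for words in doc]
    head_weights: List[List[float]] = []
    doc_vector: Vector = []
    for query in head_queries:
        weights, pooled = attention_pool(sentence_vectors, query)
        head_weights.append(weights)
        doc_vector.extend(pooled)
    return head_weights, doc_vector


# Toy usage: two sentences of two 2-d word vectors each, two heads.
toy_doc = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
toy_weights, toy_vector = encode_document(
    toy_doc, word_query=[0.0, 0.0], head_queries=[[2.0, 0.0], [0.0, 2.0]]
)
```

In this toy setup each head attends mostly to a different sentence, which is one simple way to picture a "diverse ensemble" of heads. How the actual model encourages diversity among heads is not specified in the abstract, so this sketch does not attempt to reproduce it.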
Title: Disentangling Structure and Style: Political Bias Detection in News by Inducing Document Hierarchy