We present a study on sentence-level factuality and bias of news articles across domains. While prior work in NLP has focused mainly on predicting the factuality of news reporting and the political-ideological bias of news media at the article level, we investigated how framing bias surfaces in factual reporting across domains in order to predict factuality and bias at the sentence level, which may more accurately explain the overall reliability of an entire document. First, we manually annotated a large sentence-level dataset, titled FactNews, composed of 6,191 sentences from 100 news stories, each covered by three different outlets, yielding 300 news articles in total. We then studied how biased and factual spans surface in news articles from different media outlets and different domains. Finally, we presented a baseline model for sentence-level factuality prediction by fine-tuning BERT. We also provide a detailed data analysis demonstrating the reliability of both the annotation and the models.
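As a rough illustration of the kind of baseline the abstract describes, the sketch below fine-tunes a BERT-style encoder for binary sentence-level factuality classification. This is not the authors' released code: it assumes the HuggingFace transformers library, and the checkpoint name (bert-base-multilingual-cased), the two-way label scheme (factual vs. biased), and the toy example sentences are illustrative assumptions rather than details from the paper.

"""Minimal sketch of a sentence-level factuality baseline via BERT fine-tuning.

Illustrative reconstruction only; checkpoint, labels, and data are assumptions.
"""
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # assumption: any BERT-style encoder could be used
LABELS = {"factual": 0, "biased": 1}         # hypothetical binary label scheme


class SentenceDataset(Dataset):
    """Wraps (sentence, label) pairs and tokenizes them for the encoder."""

    def __init__(self, sentences, labels, tokenizer, max_len=128):
        self.enc = tokenizer(sentences, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item


def train(sentences, labels, epochs=3, lr=2e-5, batch_size=16):
    """Fine-tunes the encoder with a sequence-classification head on the given sentences."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=len(LABELS))
    loader = DataLoader(SentenceDataset(sentences, labels, tokenizer),
                        batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            out = model(**batch)  # returns cross-entropy loss when labels are provided
            out.loss.backward()
            optimizer.step()
    return tokenizer, model


if __name__ == "__main__":
    # Toy placeholder data; in practice these would be FactNews sentences and labels.
    sents = ["The minister announced the budget on Tuesday.",
             "The minister's shameful budget betrays ordinary citizens."]
    labs = [LABELS["factual"], LABELS["biased"]]
    tokenizer, model = train(sents, labs, epochs=1, batch_size=2)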