普及到未来:减轻假新闻探测中的实体偏见 (Generalizing to the Future: Mitigating Entity Bias in Fake News Detection) - 专知论文

会员服务 ·

0

entity · 有偏 · MoDELS · 泛化理论 · Extensibility ·

2022 年 4 月 20 日

Generalizing to the Future: Mitigating Entity Bias in Fake News Detection

翻译：普及到未来:减轻假新闻探测中的实体偏见

Yongchun Zhu,Qiang Sheng,Juan Cao,Shuokai Li,Danding Wang,Fuzhen Zhuang

from arxiv, Accepted by SIGIR 2022

The wide dissemination of fake news is increasingly threatening both individuals and society. Fake news detection aims to train a model on the past news and detect fake news of the future. Though great efforts have been made, existing fake news detection methods overlooked the unintended entity bias in the real-world data, which seriously influences models' generalization ability to future data. For example, 97\% of news pieces in 2010-2017 containing the entity `Donald Trump' are real in our data, but the percentage falls down to merely 33\% in 2018. This would lead the model trained on the former set to hardly generalize to the latter, as it tends to predict news pieces about `Donald Trump' as real for lower training loss. In this paper, we propose an entity debiasing framework (\textbf{ENDEF}) which generalizes fake news detection models to the future data by mitigating entity bias from a cause-effect perspective. Based on the causal graph among entities, news contents, and news veracity, we separately model the contribution of each cause (entities and contents) during training. In the inference stage, we remove the direct effect of the entities to mitigate entity bias. Extensive offline experiments on the English and Chinese datasets demonstrate that the proposed framework can largely improve the performance of base fake news detectors, and online tests verify its superiority in practice. To the best of our knowledge, this is the first work to explicitly improve the generalization ability of fake news detection models to the future data. The code has been released at https://github.com/ICTMCG/ENDEF-SIGIR2022.

翻译：假新闻的广泛传播正在日益威胁个人和社会。假新闻检测旨在对过去新闻的模型进行培训,并探测未来假新闻。虽然已经做出了巨大的努力,但现有的假新闻检测方法忽略了真实世界数据中的意外实体偏差,这严重影响了模型对未来数据的概括能力。例如,2010-2017年97份包含实体“Donald Trump”的新闻文章在我们的数据中是真实的,但百分比在2018年下降到仅仅33 ⁇ 。这将导致在前一套中训练的模型几乎不向后一套推广,因为它往往预测“Donald Trump”是真实的培训损失。在本文中,我们提议了一个实体去掉偏见的框架(\ textb{ENDEF}),通过从因果关系角度减少实体对结果的偏差,将假新闻检测模型与未来数据进行概括。根据实体之间的因果关系图、新闻内容和新闻真实性,我们在培训中分别将每种原因(内容和内容)的解析出来。在第一个阶段,我们大幅提升了“Dlefard Trual ” 能力,我们在模拟的测试中可以减少其基础上的数据测试。我们未来的实体对结果的测试。在英国测试中可以明确地展示。

0

相关内容

entity

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

“核HO-1”调控miRNA-125a-5p影响血脊髓屏障结构和功能的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Insulicolide A的全合成和结构优化

国家自然科学基金

0+阅读 · 2014年12月31日

气固非催化反应中固体产物介尺度结构的形成与生长

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

基于实例空间压缩的minsum目标的平行机在线排序研究

国家自然科学基金

0+阅读 · 2012年12月31日

ROS在调控心肌衰老过程中Beclin 1-Vps34复合体功能和自噬流的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

应用于RF MEMS开关的新型减振合金薄膜材料中孪晶行为的研究

国家自然科学基金

0+阅读 · 2012年12月31日

肝素酶过表达与脓毒症血管内皮功能不全的相关性研究

国家自然科学基金

0+阅读 · 2012年12月31日

缺氧时HIF-1α转录激活自噬蛋白Beclin 1促进鼻咽癌转移机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization

Arxiv

0+阅读 · 2022年6月9日

The Missing Link: Finding label relations across datasets

The Missing Link: Finding label relations across datasets

Arxiv

0+阅读 · 2022年6月9日

Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves

Arxiv

0+阅读 · 2022年6月8日

Using Mixed-Effect Models to Learn Bayesian Networks from Related Data Sets

Arxiv

0+阅读 · 2022年6月8日

DebiasBench: Benchmark for Fair Comparison of Debiasing in Image Classification

Arxiv

0+阅读 · 2022年6月8日

Delving into the Pre-training Paradigm of Monocular 3D Object Detection

Arxiv

0+阅读 · 2022年6月8日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

17+阅读 · 2019年10月30日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

VIP会员

文章信息

相关主题

相关VIP内容

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

基于大型语言模型的网络威胁情报：利用LLM提取MITRE ATT&CK技术 | 最新文献

无人机（UAV）战略：区域大国与暴力非国家行为体在中东冲突中对无人机的运用 | 130页

神经技术与未来无人机战争的交汇点 | 最新报告

美国从“蛛网行动”中汲取轰炸机舰队保护教训

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium3

中国图象图形学学会CSIG

0+阅读 · 2021年11月9日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization

Arxiv

0+阅读 · 2022年6月9日

The Missing Link: Finding label relations across datasets

The Missing Link: Finding label relations across datasets

Arxiv

0+阅读 · 2022年6月9日

Estimation of Predictive Performance in High-Dimensional Data Settings using Learning Curves

Arxiv

0+阅读 · 2022年6月8日

Using Mixed-Effect Models to Learn Bayesian Networks from Related Data Sets

Arxiv

0+阅读 · 2022年6月8日

DebiasBench: Benchmark for Fair Comparison of Debiasing in Image Classification

Arxiv

0+阅读 · 2022年6月8日

Delving into the Pre-training Paradigm of Monocular 3D Object Detection

Arxiv

0+阅读 · 2022年6月8日

Generalized Out-of-Distribution Detection: A Survey

Generalized Out-of-Distribution Detection: A Survey

Arxiv

15+阅读 · 2021年10月21日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Meta-Learning to Cluster

Meta-Learning to Cluster

Arxiv

17+阅读 · 2019年10月30日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

相关基金

“核HO-1”调控miRNA-125a-5p影响血脊髓屏障结构和功能的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Insulicolide A的全合成和结构优化

国家自然科学基金

0+阅读 · 2014年12月31日

气固非催化反应中固体产物介尺度结构的形成与生长

国家自然科学基金

0+阅读 · 2013年12月31日

Intraflagellar Transport运输纤毛蛋白的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

基于实例空间压缩的minsum目标的平行机在线排序研究

国家自然科学基金

0+阅读 · 2012年12月31日

ROS在调控心肌衰老过程中Beclin 1-Vps34复合体功能和自噬流的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

应用于RF MEMS开关的新型减振合金薄膜材料中孪晶行为的研究

国家自然科学基金

0+阅读 · 2012年12月31日

肝素酶过表达与脓毒症血管内皮功能不全的相关性研究

国家自然科学基金

0+阅读 · 2012年12月31日

缺氧时HIF-1α转录激活自噬蛋白Beclin 1促进鼻咽癌转移机制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员