The DocRED dataset is one of the most popular and widely used benchmarks for document-level relation extraction (RE). It adopts a recommend-revise annotation scheme to obtain large-scale annotations. However, we find that the annotation of DocRED is incomplete, i.e., false negative samples are prevalent. We analyze the causes and effects of this overwhelming false negative problem in the DocRED dataset. To address this shortcoming, we re-annotate 4,053 documents in the DocRED dataset by adding the missing relation triples back to the original DocRED. We name our revised DocRED dataset Re-DocRED. We conduct extensive experiments with state-of-the-art neural models on both datasets, and the experimental results show that models trained and evaluated on Re-DocRED achieve performance improvements of around 13 F1 points. Moreover, we conduct a comprehensive analysis to identify potential areas for further improvement. Our dataset is publicly available at https://github.com/tonytan48/Re-DocRED.