GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles

June 1, 2023

Tanmay Parekh, I-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, Nanyun Peng

From arXiv. Accepted at the ACL 2023 main conference.

Recent work in Event Argument Extraction (EAE) has focused on improving model generalizability to cater to new events and domains. However, standard benchmarking datasets like ACE and ERE cover fewer than 40 event types and 25 entity-centric argument roles. Limited diversity and coverage hinder these datasets from adequately evaluating the generalizability of EAE models. In this paper, we first contribute a large and diverse EAE ontology. The ontology is created by transforming FrameNet, a comprehensive semantic role labeling (SRL) dataset, for EAE by exploiting the similarity between the two tasks. Exhaustive human expert annotations are then collected to build the ontology, resulting in 115 events and 220 argument roles, with a significant portion of the roles not being entities. We use this ontology to introduce GENEVA, a diverse generalizability benchmarking dataset comprising four test suites aimed at evaluating models' ability to handle limited data and to generalize to unseen event types. We benchmark six EAE models from various families. The results show that, owing to non-entity argument roles, even the best-performing model achieves only a 39% F1 score, indicating that GENEVA poses new challenges for generalization in EAE. Overall, our large and diverse EAE ontology can aid in creating more comprehensive future resources, while GENEVA is a challenging benchmarking dataset that encourages further research on improving generalizability in EAE. The code and data can be found at https://github.com/PlusLabNLP/GENEVA.
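
For readers unfamiliar with how an F1 score like the one quoted above is typically computed for argument extraction, the following is a minimal sketch. It assumes exact-match, micro-averaged scoring over (event type, role, argument text) triples. That scheme, the `micro_f1` helper, and the toy event and role names are all illustrative assumptions and are not taken from the GENEVA evaluation code.

```python
from typing import Set, Tuple

# Hypothetical representation: one argument = (event_type, role, argument_text).
Argument = Tuple[str, str, str]


def micro_f1(gold: Set[Argument], pred: Set[Argument]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring (an assumption)."""
    correct = len(gold & pred)  # predictions that match a gold triple exactly
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example with made-up labels; "Place" is a non-entity-style role here.
gold = {
    ("Attack", "Attacker", "rebels"),
    ("Attack", "Target", "the convoy"),
    ("Attack", "Place", "near the border"),
}
pred = {
    ("Attack", "Attacker", "rebels"),
    ("Attack", "Target", "convoy"),  # span mismatch, so it counts as wrong
}

precision, recall, f1 = micro_f1(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy case only one of the two predictions matches a gold triple, so precision is 1/2, recall is 1/3, and F1 is 0.40. Under exact matching, a partially correct span earns no credit, which may matter for longer argument spans.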
