KPI-EDGAR: 用于从财务文件中提取关系的新数据集和配套矩阵 (KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents) - 专知论文

会员服务 ·

0

entity · 数据集 · Analysis · Performer · Better ·

2022 年 10 月 17 日

KPI-EDGAR: A Novel Dataset and Accompanying Metric for Relation Extraction from Financial Documents

翻译：KPI-EDGAR: 用于从财务文件中提取关系的新数据集和配套矩阵

Tobias Deußer,Syed Musharraf Ali,Lars Hillebrand,Desiana Nurchalifah,Basil Jacob,Christian Bauckhage,Rafet Sifa

from arxiv, Accepted at ICMLA 2022, 6 pages, 5 tables

We introduce KPI-EDGAR, a novel dataset for Joint Named Entity Recognition and Relation Extraction building on financial reports uploaded to the Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, where the main objective is to extract Key Performance Indicators (KPIs) from financial documents and link them to their numerical values and other attributes. We further provide four accompanying baselines for benchmarking potential future research. Additionally, we propose a new way of measuring the success of said extraction process by incorporating a word-level weighting scheme into the conventional F1 score to better model the inherently fuzzy borders of the entity pairs of a relation in this domain.

翻译：我们引入了KPI-EDGAR,这是在上载到电子数据收集、分析和检索系统(EDGAR)的财务报告基础上联合命名实体确认和采掘的一套新颖的数据集,其主要目的是从财务文件中提取关键业绩指标,并将其与数值和其他属性联系起来;我们进一步为今后可能的研究基准设定了四个相伴随的基准;此外,我们提出了一种新的衡量上述提取过程成功与否的方法,在常规的F1评分中纳入一个字级加权办法,以便更好地模拟与该领域关系实体对立的内在模糊边界。

0

相关内容

entity

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

溶剂热法FeSe基超导材料制备和物性研究

国家自然科学基金

0+阅读 · 2014年12月31日

胶质母细胞瘤干性起源的分子生物学研究

国家自然科学基金

0+阅读 · 2012年12月31日

长链非编码RNACUST52998调控T细胞的功能及其对SLE的作用和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

荷人卵巢上皮癌裸鼠冻融卵巢组织移植的安全性研究

国家自然科学基金

0+阅读 · 2009年12月31日

小麦旗叶衰老相关基因的克隆及功能分析

国家自然科学基金

0+阅读 · 2009年12月31日

Method for Determining the Similarity of Text Documents for the Kazakh language, Taking Into Account Synonyms: Extension to TF-IDF

Arxiv

0+阅读 · 2022年11月22日

Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

Arxiv

10+阅读 · 2021年1月24日

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

Arxiv

11+阅读 · 2019年9月23日

Global Relation Embedding for Relation Extraction

Arxiv

10+阅读 · 2018年4月19日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

VIP会员

文章信息

相关主题

相关VIP内容

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

最新BERT相关论文清单，BERT-related Papers

最新BERT相关论文清单，BERT-related Papers

专知会员服务

53+阅读 · 2019年9月29日

热门VIP内容

开通专知VIP会员享更多权益服务

《基于AI的动态任务分配策略实现多智能体系统有意义人类控制》报告

《超越连接：AI驱动网络未来愿景》最新报告

人工智能赋能多域作战：能力与挑战

《战场空间决策优势：AI基础与应用研究》总结报告

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium2

中国图象图形学学会CSIG

0+阅读 · 2021年11月8日

【ICIG2021】Latest News & Announcements of the Industry Talk2

【ICIG2021】Latest News & Announcements of the Industry Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年7月29日

相关论文

Method for Determining the Similarity of Text Documents for the Kazakh language, Taking Into Account Synonyms: Extension to TF-IDF

Arxiv

0+阅读 · 2022年11月22日

Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

Arxiv

10+阅读 · 2021年1月24日

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

Arxiv

11+阅读 · 2019年9月23日

Global Relation Embedding for Relation Extraction

Arxiv

10+阅读 · 2018年4月19日

Zero-Shot Transfer Learning for Event Extraction

Arxiv

10+阅读 · 2017年7月4日

相关基金

溶剂热法FeSe基超导材料制备和物性研究

国家自然科学基金

0+阅读 · 2014年12月31日

胶质母细胞瘤干性起源的分子生物学研究

国家自然科学基金

0+阅读 · 2012年12月31日

长链非编码RNACUST52998调控T细胞的功能及其对SLE的作用和机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

荷人卵巢上皮癌裸鼠冻融卵巢组织移植的安全性研究

国家自然科学基金

0+阅读 · 2009年12月31日

小麦旗叶衰老相关基因的克隆及功能分析

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员