努力实现广泛覆盖的命名实体资源:多种不同语文的数据效率办法 (Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages) - 专知论文

会员服务 ·

0

entity · 多样性 · 知识 (knowledge) · 讲稿 · Performer ·

2022 年 4 月 29 日

Towards a Broad Coverage Named Entity Resource: A Data-Efficient Approach for Many Diverse Languages

翻译：努力实现广泛覆盖的命名实体资源:多种不同语文的数据效率办法

Silvia Severini,Ayyoob Imani,Philipp Dufter,Hinrich Schütze

from arxiv, LREC 2022

Parallel corpora are ideal for extracting a multilingual named entity (MNE) resource, i.e., a dataset of names translated into multiple languages. Prior work on extracting MNE datasets from parallel corpora required resources such as large monolingual corpora or word aligners that are unavailable or perform poorly for underresourced languages. We present CLC-BN, a new method for creating an MNE resource, and apply it to the Parallel Bible Corpus, a corpus of more than 1000 languages. CLC-BN learns a neural transliteration model from parallel-corpus statistics, without requiring any other bilingual resources, word aligners, or seed data. Experimental results show that CLC-BN clearly outperforms prior work. We release an MNE resource for 1340 languages and demonstrate its effectiveness in two downstream tasks: knowledge graph augmentation and bilingual lexicon induction.

翻译：为了提取多语种名称实体(MNE)资源,即翻译成多种语言的地名数据集,理想的平行公司是提取多语种名称实体(MNE)资源。以前从平行公司提取MNE数据集的工作需要大量资源,如大型单一语言公司或对资源不足的语言来说不可用或表现不佳的单词匹配者。我们介绍了创建MNE资源的新方法CLC-BN, 并将其应用于具有1000多种语言的平行圣经公司(Corpus)。CLC-BN从平行公司统计数据中学习了一个神经转写模式,而不需要任何其他双语资源、词匹配者或种子数据。实验结果表明,CLC-BN显然超越了先前的工作。我们为1340种语言发放了MNE资源,并展示了其在两个下游任务(知识图表增强和双语词汇介绍)中的有效性。

0

相关内容

entity

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

白光LED用无稀土掺杂氮化硼基（BN-）氮（氧）化物荧光材料的研究

国家自然科学基金

0+阅读 · 2015年12月31日

高效脱除水中重金属离子的垂直取向介孔膜的合成与性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

肠浒苔（Enteromorpha intestinalis）多糖的结构和抗肿瘤活性随季节与地域变化规律研究

国家自然科学基金

0+阅读 · 2012年12月31日

最优传输问题与随机矩阵

国家自然科学基金

3+阅读 · 2012年12月31日

F/Cl+CHD3反应中反应物振动激发与取向的立体动力学理论研究

国家自然科学基金

0+阅读 · 2012年12月31日

全国第十六届大环化学暨第八届超分子化学学术讨论会

国家自然科学基金

0+阅读 · 2012年2月28日

瞿麦诱导肿瘤细胞凋亡的活性成分及分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

稀土RE-Fe-Cr三元系相图及其化合物吸波性能研究

国家自然科学基金

0+阅读 · 2009年12月31日

胶西北矿集区蚀变岩型金矿资源潜力评价分形方法研究

国家自然科学基金

0+阅读 · 2008年12月31日

REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction

Arxiv

0+阅读 · 2022年6月20日

Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets

Arxiv

0+阅读 · 2022年6月17日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Arxiv

13+阅读 · 2019年11月14日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

36+阅读 · 2019年6月6日

Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks

Arxiv

17+阅读 · 2018年6月5日

Global Relation Embedding for Relation Extraction

Arxiv

10+阅读 · 2018年4月19日

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Arxiv

21+阅读 · 2018年1月16日

Adversarial Learning for Chinese NER from Crowd Annotations

Arxiv

15+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

发射器定位中的传感器路径规划研究 | 235页

战略无人机 | 2025最新80页

蜂窝通信是否是无人机与无人地面战车主宰战场的关键？

无人机对机动战的影响 | 2025最新文献

相关资讯

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

REKnow: Enhanced Knowledge for Joint Entity and Relation Extraction

Arxiv

0+阅读 · 2022年6月20日

Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets

Arxiv

0+阅读 · 2022年6月17日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Arxiv

12+阅读 · 2020年6月23日

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Arxiv

13+阅读 · 2019年11月14日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

36+阅读 · 2019年6月6日

Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks

Arxiv

17+阅读 · 2018年6月5日

Global Relation Embedding for Relation Extraction

Arxiv

10+阅读 · 2018年4月19日

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Arxiv

21+阅读 · 2018年1月16日

Adversarial Learning for Chinese NER from Crowd Annotations

Arxiv

15+阅读 · 2018年1月16日

相关基金

白光LED用无稀土掺杂氮化硼基（BN-）氮（氧）化物荧光材料的研究

国家自然科学基金

0+阅读 · 2015年12月31日

高效脱除水中重金属离子的垂直取向介孔膜的合成与性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Schrodinger-Poisson方程的若干问题研究

国家自然科学基金

1+阅读 · 2012年12月31日

肠浒苔（Enteromorpha intestinalis）多糖的结构和抗肿瘤活性随季节与地域变化规律研究

国家自然科学基金

0+阅读 · 2012年12月31日

最优传输问题与随机矩阵

国家自然科学基金

3+阅读 · 2012年12月31日

F/Cl+CHD3反应中反应物振动激发与取向的立体动力学理论研究

国家自然科学基金

0+阅读 · 2012年12月31日

全国第十六届大环化学暨第八届超分子化学学术讨论会

国家自然科学基金

0+阅读 · 2012年2月28日

瞿麦诱导肿瘤细胞凋亡的活性成分及分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

稀土RE-Fe-Cr三元系相图及其化合物吸波性能研究

国家自然科学基金

0+阅读 · 2009年12月31日

胶西北矿集区蚀变岩型金矿资源潜力评价分形方法研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员