April 8, 2023

MphayaNER: Named Entity Recognition for Tshivenda

Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi, Tshimangadzo Rakhuhu, Aluwani Mauda, Tshifhiwa Joshua Maumela, Andisani Masindi, Seani Rananga, Vukosi Marivate, Tshilidzi Marwala

From arXiv. Accepted at the AfricaNLP Workshop at ICLR 2023.

Named Entity Recognition (NER) plays a vital role in various Natural Language Processing tasks such as information retrieval, text classification, and question answering. However, NER can be challenging, especially in low-resource languages with limited annotated datasets and tools. This paper adds to the effort of addressing these challenges by introducing MphayaNER, the first Tshivenda NER corpus in the news domain. We establish NER baselines by fine-tuning state-of-the-art models on MphayaNER. The study also explores zero-shot transfer between Tshivenda and other related Bantu languages, with chiShona and Kiswahili showing the best results. Augmenting MphayaNER with chiShona data was also found to improve model performance significantly. Both MphayaNER and the baseline models are made publicly available.
