Bangla -- ranked the 6th most widely spoken language in the world (https://www.ethnologue.com/guides/ethnologue200), with 230 million native speakers -- is still considered a low-resource language in the natural language processing (NLP) community. Despite three decades of research, Bangla NLP (BNLP) continues to lag behind, mainly due to the scarcity of resources and the challenges that come with it. There is sparse work across different areas of BNLP; however, a thorough survey reporting previous work and recent advances has yet to be conducted. In this study, we first provide a review of Bangla NLP tasks, resources, and tools available to the research community; we then benchmark datasets collected from various platforms for nine NLP tasks using current state-of-the-art algorithms (i.e., transformer-based models). We provide comparative results for the studied NLP tasks by comparing monolingual and multilingual models of varying sizes. We report results on both individual and consolidated datasets and provide data splits for future research. In total, we reviewed 108 papers and conducted 175 sets of experiments. Our results show promising performance using transformer-based models while highlighting the trade-off with computational cost. We hope that this comprehensive survey will motivate the community to build on and further advance research on Bangla NLP.