March 7, 2021

Translating the Unseen? Yorùbá $\rightarrow$ English MT in Low-Resource, Morphologically-Unmarked Settings

Ife Adebara, Miikka Silfverberg, Muhammad Abdul-Mageed

Translating between languages where certain features are marked morphologically in one but absent or marked contextually in the other is an important test case for machine translation. When translating into English, which marks (in)definiteness morphologically, from Yorùbá, which uses bare nouns but marks these features contextually, ambiguities arise. In this work, we perform a fine-grained analysis of how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns (BNs) in Yorùbá into English. We investigate to what extent the systems identify BNs and correctly translate them, and how they compare with human translation patterns. We also analyze the types of errors each model makes and provide a linguistic description of these errors. We glean insights for evaluating model performance in low-resource settings. In translating bare nouns, our results show that the Transformer model outperforms the SMT and BiLSTM models for 4 categories, the BiLSTM outperforms the SMT model for 3 categories, while the SMT outperforms the NMT models for 1 category.
