Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages - Zhuanzhi Papers


Estimation/Estimator · Unsupervised · Unknown token · Machine translation · Attention

31 July 2022

Mismatching-Aware Unsupervised Translation Quality Estimation For Low-Resource Languages


Fatemeh Azadi, Heshaam Faili, Mohammad Javad Dousti

From arXiv; submitted to Language Resources and Evaluation

Translation quality estimation (QE) is the task of predicting the quality of machine translation (MT) output without reference translations. The task has gained increasing attention as an important component in practical applications of MT. In this paper, we first propose XLMRScore, a simple unsupervised QE method based on BERTScore computed with the XLM-RoBERTa (XLMR) model, and discuss the issues that arise with this method. We then suggest two approaches to mitigate these issues: replacing untranslated words with the unknown token, and cross-lingually aligning the pre-trained model so that aligned words are represented closer to each other. We evaluate the proposed method on four low-resource language pairs of the WMT21 QE shared task, as well as on a new English-Farsi test dataset introduced in this paper. Experiments show that our method achieves results comparable to the supervised baseline in two zero-shot scenarios, i.e., with less than 0.01 difference in Pearson correlation, while outperforming the unsupervised rivals on all the low-resource language pairs by more than 8% on average.
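The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: BERTScore-style greedy matching between source and translation token embeddings, and replacing untranslated words with an unknown token. It is not the authors' code. The function names `cosine`, `greedy_f1`, and `mask_untranslated`, the `<unk>` placeholder string, and the rule "an alphabetic target token that also appears in the source counts as untranslated" are all assumptions made for illustration. In practice the embeddings would come from a multilingual encoder such as XLM-RoBERTa; here they are plain lists of floats so the sketch runs without any model.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_f1(src_emb: Sequence[Vector], tgt_emb: Sequence[Vector]) -> float:
    """BERTScore-style F1 between source and translation token embeddings.

    Assumes both sequences are non-empty. The source plays the role that
    the reference plays in reference-based BERTScore.
    """
    sim = [[cosine(s, t) for t in tgt_emb] for s in src_emb]
    # Recall: each source token is greedily matched to its best target token.
    recall = sum(max(row) for row in sim) / len(src_emb)
    # Precision: each target token is greedily matched to its best source token.
    precision = sum(max(col) for col in zip(*sim)) / len(tgt_emb)
    denom = precision + recall
    return 2 * precision * recall / denom if denom > 0 else 0.0


def mask_untranslated(src_tokens: List[str], tgt_tokens: List[str],
                      unk: str = "<unk>") -> List[str]:
    """Replace target tokens copied verbatim from the source with `unk`.

    Hypothetical heuristic: only alphabetic tokens are masked, so shared
    punctuation and numbers are left alone.
    """
    src_set = set(src_tokens)
    return [unk if (t in src_set and t.isalpha()) else t for t in tgt_tokens]
```

For example, `mask_untranslated(["the", "cat", "sat", "."], ["le", "cat", "assis", "."])` masks only the copied word `cat`, so that an untranslated word is no longer rewarded for matching its own source embedding when the score is computed.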

