NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures - Zhuanzhi Papers

Similarity measures · Similarity · Multilingual neural machine translation · Pivotal (company) · NMT

April 28, 2022

NMTScore: A Multilingual Analysis of Translation-based Text Similarity Measures

Jannis Vamvas, Rico Sennrich

Being able to rank the similarity of short text segments is an interesting bonus feature of neural machine translation. Translation-based similarity measures include direct and pivot translation probability, as well as translation cross-likelihood, which has not been studied so far. We analyze these measures in the common framework of multilingual NMT, releasing the NMTScore library (available at https://github.com/ZurichNLP/nmtscore). Compared to baselines such as sentence embeddings, translation-based measures prove competitive in paraphrase identification and are more robust against adversarial or multilingual input, especially if proper normalization is applied. When used for reference-based evaluation of data-to-text generation in 2 tasks and 17 languages, translation-based measures show a relatively high correlation to human judgments.
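
To make the idea of a normalized, translation-based similarity score concrete, here is a minimal Python sketch. It does not use the NMTScore library or any real NMT model: `toy_translation_prob` is a hypothetical stand-in for a model's length-normalized translation probability, and the normalization shown (dividing by the probability of "translating" a sentence into itself, then averaging both directions) is an assumption about what "proper normalization" could look like, since the abstract does not spell it out.

```python
import math


def toy_translation_prob(hypothesis: str, source: str) -> float:
    """Hypothetical stand-in for a length-normalized p(hypothesis | source).

    A real system would read per-token probabilities from a multilingual
    NMT model. Here a hypothesis token gets probability 0.9 if it also
    occurs in the source and 0.1 otherwise; the result is the geometric
    mean over the hypothesis tokens.
    """
    src_tokens = set(source.lower().split())
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        raise ValueError("hypothesis must contain at least one token")
    log_probs = [math.log(0.9 if tok in src_tokens else 0.1) for tok in hyp_tokens]
    return math.exp(sum(log_probs) / len(log_probs))


def direct_similarity(a: str, b: str, prob_fn=toy_translation_prob,
                      normalize: bool = True) -> float:
    """Symmetrized direct translation probability between a and b.

    With normalize=True, each direction is divided by the score of the
    sentence given itself (an assumed normalization), so that identical
    inputs score 1.0 regardless of how confident the model is.
    """
    def one_direction(x: str, y: str) -> float:
        score = prob_fn(x, y)
        if normalize:
            score /= prob_fn(x, x)
        return score

    # Average both directions so the measure is symmetric in a and b.
    return 0.5 * (one_direction(a, b) + one_direction(b, a))


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    paraphrase = "a cat sat on the mat"
    unrelated = "stock prices fell sharply"
    print(direct_similarity(reference, paraphrase))
    print(direct_similarity(reference, unrelated))
```

In this toy setting the paraphrase pair scores higher than the unrelated pair, and normalization maps identical inputs to 1.0. Swapping `prob_fn` for a function backed by an actual multilingual NMT model would be the next step.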
