Cross-lingual text representations have recently gained popularity and serve as the backbone of many tasks, such as unsupervised machine translation and cross-lingual information retrieval. However, evaluating such representations in domains beyond standard benchmarks is difficult because it requires domain-specific parallel data across many language pairs. In this paper, we propose an automatic metric for evaluating the quality of cross-lingual textual representations that uses the images in a paired image-text evaluation dataset as a proxy. Experimentally, Backretrieval correlates highly with ground-truth metrics on annotated datasets, and our analysis shows statistically significant improvements over baselines. Our experiments conclude with a case study on a recipe dataset without parallel cross-lingual data. We illustrate how to judge cross-lingual embedding quality with Backretrieval and validate the outcome with a small human study.
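To make the idea of using images as a retrieval pivot concrete, the sketch below shows one way an image-mediated recall metric could be computed from precomputed embeddings. It is a minimal illustration, not the paper's definition of Backretrieval: the function names, the two-hop retrieval protocol (source text to nearest image, then image to top-k target texts), and the use of cosine similarity are all assumptions made for this example.

```python
import numpy as np


def cosine_sim(a, b):
    """Cosine similarity matrix between two sets of row-vector embeddings."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def image_mediated_recall_at_k(src_text_emb, img_emb, tgt_text_emb, k=1):
    """Hypothetical image-pivot retrieval score (illustrative protocol only).

    For each source-language text i, retrieve its nearest image, then
    retrieve the k nearest target-language texts for that image and
    check whether the aligned target text i is among them.
    Assumes row i of all three matrices refers to the same item.
    """
    # Step 1: source text -> index of its nearest image.
    nearest_img = cosine_sim(src_text_emb, img_emb).argmax(axis=1)

    # Step 2: retrieved image -> top-k target-language texts.
    img_to_tgt = cosine_sim(img_emb, tgt_text_emb)
    hits = 0
    for i, j in enumerate(nearest_img):
        topk = np.argsort(-img_to_tgt[j])[:k]
        hits += int(i in topk)
    return hits / len(src_text_emb)


# Toy usage: random embeddings stand in for real cross-lingual encoders.
rng = np.random.default_rng(0)
n, d = 100, 64
src = rng.normal(size=(n, d))
img = rng.normal(size=(n, d))
tgt = rng.normal(size=(n, d))
print(image_mediated_recall_at_k(src, img, tgt, k=5))
```

Because the score only needs item-aligned image-text pairs in each language, rather than parallel text across languages, a metric of this shape can be computed in domains where cross-lingual parallel data is unavailable.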