Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation - 专知 (Zhuanzhi) Papers


Bayes risk · approximation · minimum point · beam search · decoding

25 October 2022

Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation


Bryan Eikema, Wilker Aziz

From arXiv; EMNLP 2022 camera-ready

In NMT we search for the mode of the model distribution to form predictions. The mode and other high-probability translations found by beam search have been shown to often be inadequate in a number of ways. This prevents improving translation quality through better search, as these idiosyncratic translations end up selected by the decoding algorithm, a problem known as the beam search curse. Recently, an approximation to minimum Bayes risk (MBR) decoding has been proposed as an alternative decision rule that would likely not suffer from the same problems. We analyse this approximation and establish that it has no equivalent to the beam search curse. We then design approximations that decouple the cost of exploration from the cost of robust estimation of expected utility. This allows for much larger hypothesis spaces, which we show to be beneficial. We also show that mode-seeking strategies can aid in constructing compact sets of promising hypotheses and that MBR is effective in identifying good translations in them. We conduct experiments on three language pairs varying in amounts of resources available: English into and from German, Romanian, and Nepali.
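The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `unigram_f1` and `mbr_decode` are hypothetical, the utility is a toy token-overlap F1 standing in for a real translation metric, and the candidate strings are hard-coded where a real system would draw samples from the NMT model. The sketch keeps the hypothesis set separate from the pseudo-reference set used to estimate expected utility, which is the decoupling the abstract refers to.

```python
from collections import Counter
from typing import Callable, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption): F1 overlap of whitespace tokens."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_decode(
    hypotheses: Sequence[str],
    pseudo_refs: Sequence[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the hypothesis with the highest Monte Carlo estimate of expected utility.

    `hypotheses` is the set being explored; `pseudo_refs` are samples that
    stand in for the model distribution. The two sets may differ in size.
    """
    if not hypotheses or not pseudo_refs:
        raise ValueError("need at least one hypothesis and one pseudo-reference")

    def expected_utility(h: str) -> float:
        return sum(utility(h, r) for r in pseudo_refs) / len(pseudo_refs)

    return max(hypotheses, key=expected_utility)


# Hard-coded stand-ins for model samples (illustrative data only).
hypotheses = ["the cat sat on the mat", "a cat sat on the mat", "dog"]
pseudo_refs = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a cat is on the mat",
]
best = mbr_decode(hypotheses, pseudo_refs)
print(best)
```

With this toy data the selected hypothesis is the one that agrees most with the pseudo-references on average, even though it is not an exact copy of any one of them. Scoring costs one utility call per (hypothesis, pseudo-reference) pair, so enlarging the hypothesis set without enlarging the pseudo-reference set grows the cost only linearly in the number of hypotheses.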

