Nearest neighbor machine translation augments Autoregressive Translation~(AT) with $k$-nearest-neighbor retrieval, comparing the token-level context representations of the target tokens in the query against those stored in the datastore. However, token-level representations may introduce noise when translating ambiguous words, or fail to yield accurate retrieval results when the representations generated by the model carry indistinguishable context information, e.g., in Non-Autoregressive Translation~(NAT) models. In this paper, we propose a novel $n$-gram nearest neighbor retrieval method that is model agnostic and applicable to both AT and NAT models. Specifically, we concatenate the hidden representations of $n$ adjacent tokens as the key, and use the tuple of the corresponding target tokens as the value. At inference time, we propose tailored decoding algorithms for AT and NAT models respectively. We demonstrate that the proposed method consistently outperforms the token-level method on both AT and NAT models, on general as well as on domain adaptation translation tasks. On domain adaptation, the proposed method brings improvements of $1.03$ and $2.76$ average BLEU points on AT and NAT models respectively.
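To make the key--value construction concrete, the following is a minimal sketch of $n$-gram datastore building and retrieval, assuming $n=2$, toy hidden states, and plain L2-distance search; all names (`build_ngram_datastore`, `knn_lookup`) and dimensions are illustrative assumptions, not the paper's actual implementation, which would use a large-scale index such as FAISS.

```python
# Illustrative sketch: n-gram keys are concatenations of n adjacent
# hidden states; values are the tuples of the n corresponding tokens.
import numpy as np

def build_ngram_datastore(hidden_states, target_tokens, n=2):
    """Slide a window of n positions over the target side: each key is
    the concatenation of n adjacent hidden states, each value the tuple
    of the n corresponding target tokens."""
    keys, values = [], []
    for i in range(len(target_tokens) - n + 1):
        keys.append(np.concatenate(hidden_states[i:i + n]))
        values.append(tuple(target_tokens[i:i + n]))
    return np.stack(keys), values

def knn_lookup(query_states, keys, values, k=2):
    """Concatenate the query's n adjacent hidden states and return the
    values of the k nearest keys under L2 distance (the distance
    commonly used in kNN-MT)."""
    query = np.concatenate(query_states)
    dists = np.linalg.norm(keys - query, axis=1)
    return [values[j] for j in np.argsort(dists)[:k]]

# Toy usage: 4 target positions with 3-dimensional hidden states.
rng = np.random.default_rng(0)
h = [rng.standard_normal(3) for _ in range(4)]
tokens = ["wir", "gehen", "nach", "Hause"]
keys, values = build_ngram_datastore(h, tokens, n=2)
print(knn_lookup(h[0:2], keys, values, k=2))  # two nearest bigrams
```

Under these assumptions, a retrieval hit returns an $n$-token tuple rather than a single token, which is what allows the decoding algorithms for AT and NAT models to consume multi-token evidence per query.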