Word alignment, which aims to align translationally equivalent words between source and target sentences, plays an important role in many natural language processing tasks. Existing unsupervised neural alignment methods focus on inducing alignments from neural machine translation (NMT) models, which cannot leverage the full context in the target sequence. In this paper, we propose Mask-Align, a self-supervised word alignment model that takes advantage of the full context on the target side. Our model masks out each target token and predicts it conditioned on both the source and the remaining target tokens. This two-step process is based on the assumption that the source token contributing most to recovering a masked target token should be aligned to that target token. We also introduce an attention variant called leaky attention, which alleviates the problem of unexpectedly high cross-attention weights on special tokens such as periods. Experiments on four language pairs show that our model outperforms previous unsupervised neural aligners and obtains new state-of-the-art results.
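To make the mask-predict alignment idea concrete, here is a minimal sketch, not the authors' implementation: for each target position, the token is masked and re-predicted from the source and the remaining target tokens, and the source token receiving the highest cross-attention weight is taken as its alignment. The untrained toy encoder states and the single cross-attention layer below are stand-ins for a trained Mask-Align model, and names such as `src_states` and `MASK_EMB` are hypothetical; the paper additionally masks all target positions in parallel and uses leaky attention, which this sketch omits.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

SRC_LEN, TGT_LEN, D = 5, 4, 16

# Stand-in encoder states for a source sentence (in practice these would
# come from a trained encoder).
src_states = torch.randn(1, SRC_LEN, D)

# Target token embeddings; MASK_EMB replaces the token being predicted.
tgt_emb = torch.randn(1, TGT_LEN, D)
MASK_EMB = torch.zeros(D)

# A single cross-attention layer standing in for the decoder's
# target-to-source attention.
cross_attn = nn.MultiheadAttention(embed_dim=D, num_heads=1, batch_first=True)

alignments = []
for j in range(TGT_LEN):
    # Mask out position j: the model must recover it from the source and
    # the remaining (unmasked) target tokens.
    masked = tgt_emb.clone()
    masked[:, j] = MASK_EMB

    # Cross-attention from the masked target sequence to all source tokens;
    # attn_weights has shape (batch, tgt_len, src_len).
    _, attn_weights = cross_attn(masked, src_states, src_states,
                                 need_weights=True)

    # The source token contributing most to recovering target token j is
    # taken as its alignment.
    alignments.append((j, attn_weights[0, j].argmax().item()))

print(alignments)  # list of (target_index, aligned_source_index) pairs
```

With a trained model, thresholding or bidirectional agreement would typically replace the plain argmax shown here.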