Machine Translation Paper Reading List Compiled by the Tsinghua University NLP Group

December 29, 2018, AINLP

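As a machine translation practitioner, I have to recommend this list: it is detailed and comprehensive work. The information below comes from Weibo, posted by minicheshire_yang.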

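As 2018 draws to a close, on the last working day of the year, the Tsinghua University Natural Language Processing Group has compiled a machine translation paper reading list to share with everyone. The list reviews the highlight papers of the statistical machine translation (SMT) era and surveys the subfields of recent neural machine translation (NMT) research, including: model architecture, attention mechanism, open vocabulary and character-based NMT, training objectives and frameworks, decoding, low-resource language translation, multilingual translation, prior knowledge integration, document-level translation, robustness, visualization and interpretability, fairness and diversity, efficiency, speech translation and simultaneous translation, multi-modality, pre-training, domain adaptation, quality estimation, automatic post-editing, bilingual lexicon induction, and poetry translation. Feedback and suggestions are welcome! List link: https://github.com/THUNLP-MT/MT-Reading-List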

Excerpted below are the ten must-read papers since the statistical machine translation era:

Machine Translation Reading List

This is a machine translation reading list maintained by the Tsinghua Natural Language Processing Group.

The past three decades have witnessed the rapid development of machine translation, especially for data-driven approaches such as statistical machine translation (SMT) and neural machine translation (NMT). Due to the dominance of NMT at the present time, priority is given to collecting important, up-to-date NMT papers. The list is still incomplete and the categorization might be inappropriate. We will keep adding papers and improving the list. Any suggestions are welcome!

10 Must Reads
Statistical Machine Translation

Tutorials
Word-based Models
Phrase-based Models
Syntax-based Models
Discriminative Training
System Combination
Evaluation

Neural Machine Translation

Tutorials
Model Architecture
Attention Mechanism
Open Vocabulary and Character-based NMT
Training Objectives and Frameworks
Decoding
Low-resource Language Translation
    Semi-supervised Methods
    Unsupervised Methods
    Pivot-based Methods
    Data Augmentation Methods
    Data Selection Methods
    Transfer Learning & Multi-Task Learning Methods
    Meta Learning Methods
Multilingual Language Translation
Prior Knowledge Integration
    Word/Phrase Constraints
    Syntactic/Semantic Constraints
    Coverage Constraints
Document-level Translation
Robustness
Visualization and Interpretability
Fairness and Diversity
Efficiency
Speech Translation and Simultaneous Translation
Multi-modality
Pre-training
Domain Adaptation
Quality Estimation
Automatic Post-Editing
Word Translation and Bilingual Lexicon Induction
Poetry Translation

10 Must Reads

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics.
Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a Method for Automatic Evaluation of Machine Translation. In Proceedings of ACL 2002.
Philipp Koehn, Franz J. Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of NAACL 2003.
Franz Josef Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of ACL 2003.
David Chiang. 2007. Hierarchical Phrase-Based Translation. Computational Linguistics.
Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to Sequence Learning with Neural Networks. In Proceedings of NIPS 2014.
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In Proceedings of ICLR 2015.
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In Proceedings of ICLR 2015.
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of ACL 2016.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All You Need. In Proceedings of NIPS 2017.