Google Open-Sources tf-seq2seq: Now You Can Train Models with the Framework Behind Google Translate

April 12, 2017, 新智元

Compiled by 新智元

Source: Google Research

Translator: 文强

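[新智元 editor's note] Google today announced the open-source release of tf-seq2seq, a general-purpose encoder-decoder framework for TensorFlow. It can be used for machine translation, text summarization, conversational modeling, image captioning, and any other sequence-to-sequence task.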

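In 2016, we announced Google Neural Machine Translation (GNMT), a sequence-to-sequence ("seq2seq") model that is now used in the production Google Translate system. Although GNMT brought large gains in translation quality, its impact was limited because the framework for training such models was not available to external researchers.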

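Today, we are excited to introduce tf-seq2seq, an open-source seq2seq framework in TensorFlow that makes it easy to experiment with seq2seq models and achieve state-of-the-art results. To that end, we kept the tf-seq2seq codebase clean and modular, maintained full test coverage, and documented all of its functionality.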

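Our framework supports various configurations of the standard seq2seq model, such as the depth of the encoder/decoder, the attention mechanism, the RNN cell type, and the beam size. This versatility allowed us to discover optimal hyperparameters and outperform other frameworks, as described in our paper "Massive Exploration of Neural Machine Translation Architectures".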

Paper: https://arxiv.org/abs/1703.03906
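To make the idea of exploring configurations concrete, here is a minimal sketch of enumerating a hyperparameter grid over the options named above. It is not tf-seq2seq's configuration format or API: the `Seq2SeqConfig` class, the `config_grid` helper, the field names, and all candidate values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Seq2SeqConfig:
    """Hypothetical container for the options named in the text."""
    encoder_depth: int   # number of stacked encoder layers
    decoder_depth: int   # number of stacked decoder layers
    attention: str       # attention scoring variant
    cell_type: str       # RNN cell type
    beam_width: int      # beam size used at decoding time


def config_grid(depths, attentions, cell_types, beam_widths):
    """Enumerate every combination; encoder and decoder share one depth here."""
    return [
        Seq2SeqConfig(d, d, a, c, b)
        for d, a, c, b in product(depths, attentions, cell_types, beam_widths)
    ]


# Illustrative candidate values only, not the settings studied in the paper.
grid = config_grid(
    depths=[2, 4],
    attentions=["additive", "dot"],
    cell_types=["lstm", "gru"],
    beam_widths=[1, 10],
)
print(len(grid))  # 16 combinations (2 * 2 * 2 * 2)
```

Each entry in `grid` would correspond to one training run. A full grid grows multiplicatively with every added option, so a large-scale exploration needs either a lot of compute or a narrower search.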

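Figure: A seq2seq model translating from Mandarin to English. At each time step, the encoder takes in one Chinese character and its own previous state (black arrows) and produces an output vector (blue arrows). The decoder then generates the English translation word by word. At each step it takes in the last word, its previous state, and a weighted combination of all the encoder outputs (that is, attention [3], marked in blue), and then produces the next English word. Note that our implementation uses wordpieces [4] to handle rare words.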
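The "weighted combination of all the encoder outputs" can be sketched in a few lines of plain Python. This is an illustrative simplification, not the tf-seq2seq implementation: it scores each encoder output with a dot product against the decoder state, whereas the attention in [3] uses a small learned network for scoring. The function names and the toy vectors are assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_context(decoder_state, encoder_outputs):
    """Return (weights, context) for one decoder step.

    Scoring is a plain dot product (an assumption made for brevity);
    the context is the weighted sum of the encoder output vectors.
    """
    scores = [dot(decoder_state, h) for h in encoder_outputs]
    weights = softmax(scores)
    dim = len(encoder_outputs[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_outputs))
        for i in range(dim)
    ]
    return weights, context


# Toy example: three encoder outputs of dimension 2.
encoder_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context([2.0, 0.0], encoder_outputs)
print(weights, context)
```

In a real model, the context vector is combined with the decoder's previous state and the last generated word to predict the next word, and the weights are recomputed at every decoding step.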

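Beyond machine translation, tf-seq2seq can be applied to any other sequence-to-sequence task (that is, learning to produce an output sequence given an input sequence), including machine summarization, image captioning, speech recognition, and conversational modeling. We carefully designed the framework to maintain this generality, and we provide tutorials, preprocessed data, and other utilities for machine translation.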

We hope you will use tf-seq2seq to accelerate (or kick off) your own deep learning research. Contributions to our GitHub repository are also welcome.

GitHub repository: https://github.com/google/seq2seq

References:

[1] Massive Exploration of Neural Machine Translation Architectures, Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le

[2] Sequence to Sequence Learning with Neural Networks, Ilya Sutskever, Oriol Vinyals, Quoc V. Le. NIPS, 2014

[3] Neural Machine Translation by Jointly Learning to Align and Translate, Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. ICLR, 2015

[4] Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. Technical Report, 2016

[5] Attention and Augmented Recurrent Neural Networks, Chris Olah, Shan Carter. Distill, 2016

[6] Neural Machine Translation and Sequence-to-sequence Models: A Tutorial, Graham Neubig

[7] Sequence-to-Sequence Models, TensorFlow.org

Original post: https://research.googleblog.com/2017/04/introducing-tf-seq2seq-open-source.html