In this study, we first investigate a novel capsule network with dynamic routing for linear-time Neural Machine Translation (NMT), referred to as \textsc{CapsNMT}. \textsc{CapsNMT} uses an aggregation mechanism to map the source sentence into a matrix of pre-determined size, and then applies a deep LSTM network to decode the target sequence from this source representation. Unlike previous work \cite{sutskever2014sequence}, which stores the source sentence in a passive, bottom-up way, the dynamic routing policy encodes the source sentence through an iterative process that decides the credit attribution between nodes in lower and higher layers. \textsc{CapsNMT} has two core properties: it runs in time linear in the length of the sequences, and it provides a more flexible way to select, represent and aggregate the part-whole information of the source sentence. On the WMT14 English-German task and the larger WMT14 English-French task, \textsc{CapsNMT} achieves results comparable to state-of-the-art NMT systems. To the best of our knowledge, this is the first work to empirically investigate capsule networks for sequence-to-sequence problems.
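The aggregation described above can be illustrated with a minimal NumPy sketch of routing-by-agreement in the style of \cite{sabour2017dynamic}-type dynamic routing: a variable-length sequence of per-position prediction vectors is iteratively routed into a fixed number of output capsules, so the result has the same shape regardless of sentence length. All dimensions, names, and the number of iterations here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def squash(v, axis=-1, eps=1e-8):
    # Squashing non-linearity: preserves direction, maps the norm into [0, 1).
    norm_sq = np.sum(v ** 2, axis=axis, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * v / np.sqrt(norm_sq + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: (T, n_out, d) prediction vectors from each of T input
    # positions to each of n_out output capsules of dimension d.
    T, n_out, d = u_hat.shape
    b = np.zeros((T, n_out))                                   # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)   # coupling coefficients
        s = (c[:, :, None] * u_hat).sum(axis=0)                # weighted sum over positions
        v = squash(s)                                          # output capsules, (n_out, d)
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)           # reward agreement
    return v

# A sentence of any length T is aggregated into a fixed (n_out, d) matrix.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(7, 4, 8))   # T=7 positions, 4 capsules, dim 8
out = dynamic_routing(u_hat)
print(out.shape)                     # (4, 8), independent of T
```

Because the output size depends only on the chosen number of capsules, this is what allows decoding to proceed from a fixed-size source representation in time linear in the input length.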