In NMT, how far can we get without attention and without separate encoding and decoding? To answer that question, we introduce a recurrent neural translation model that does not use attention and does not have a separate encoder and decoder. Our eager translation model is low-latency, writing target tokens as soon as it reads the first source token, and uses constant memory during decoding. It performs on par with the standard attention-based model of Bahdanau et al. (2014), and better on long sentences.
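To make the eager read/write loop concrete, below is a minimal, hypothetical PyTorch sketch (names such as `EagerTranslator`, `emb_dim`, and `hidden_dim` are ours, not the paper's). A single recurrent cell consumes one source token and emits one target token per step, and only the recurrent state is carried between steps, which is why decoding memory stays constant. Training, beam search, and the alignment/padding details of the published model are omitted.

```python
import torch
import torch.nn as nn

class EagerTranslator(nn.Module):
    """Hypothetical sketch of an eager, attention-free translator:
    one recurrent cell serves as both encoder and decoder."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # The cell sees the current source token and the previous target token.
        self.cell = nn.LSTMCell(2 * emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, tgt_vocab)

    @torch.no_grad()
    def greedy_translate(self, src_ids, bos_id):
        h = torch.zeros(1, self.cell.hidden_size)
        c = torch.zeros(1, self.cell.hidden_size)
        prev_tgt = torch.tensor([bos_id])
        outputs = []
        # Read one source token, write one target token, per step:
        # the first target token is produced right after the first source token.
        for src_id in src_ids:
            x = torch.cat([self.src_emb(torch.tensor([src_id])),
                           self.tgt_emb(prev_tgt)], dim=-1)
            h, c = self.cell(x, (h, c))            # only (h, c) is kept: constant memory
            prev_tgt = self.out(h).argmax(dim=-1)  # greedy choice of the next target token
            outputs.append(prev_tgt.item())
        return outputs
```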