Unlike literal expressions, idioms' meanings do not directly follow from their parts, posing a challenge for neural machine translation (NMT). NMT models are often unable to translate idioms accurately and over-generate compositional, literal translations. In this work, we investigate whether the non-compositionality of idioms is reflected in the mechanics of the dominant NMT model, Transformer, by analysing the hidden states and attention patterns for models with English as source language and one of seven European languages as target language. When Transformer emits a non-literal translation (i.e. identifies the expression as idiomatic), the encoder processes idioms more strongly as single lexical units compared to literal expressions. This manifests in idioms' parts being grouped through attention and in reduced interaction between idioms and their context. In the decoder's cross-attention, figurative inputs result in reduced attention on source-side tokens. These results suggest that Transformer's tendency to process idioms as compositional expressions contributes to literal translations of idioms.
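The kind of attention-grouping measure described above can be sketched as follows. This is a minimal illustrative example, not the paper's actual analysis code: the attention matrix, idiom span, and function name are all hypothetical toy values, chosen only to show how within-idiom attention can be contrasted with idiom-to-context attention for one self-attention head.

```python
# Hedged sketch: given one self-attention head's weights, compare how much
# the idiom's tokens attend to each other vs. to the surrounding context.
# All values below are illustrative, not taken from a real Transformer.

def span_attention_scores(attn, span):
    """attn[i][j] = attention weight from query token i to key token j.
    span = (start, end) token indices of the idiom (end exclusive).
    Returns (mean within-idiom attention, mean idiom-to-context attention)."""
    start, end = span
    n = len(attn)
    within, to_context = [], []
    for i in range(start, end):          # queries inside the idiom
        for j in range(n):               # all key positions
            if start <= j < end:
                within.append(attn[i][j])
            else:
                to_context.append(attn[i][j])
    return sum(within) / len(within), sum(to_context) / len(to_context)

# Toy 5-token sentence; tokens 1-3 form the idiom (e.g. "kick the bucket").
attn = [
    [0.60, 0.10, 0.10, 0.10, 0.10],
    [0.05, 0.40, 0.30, 0.20, 0.05],
    [0.05, 0.30, 0.35, 0.25, 0.05],
    [0.05, 0.25, 0.30, 0.35, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.60],
]
within, ctx = span_attention_scores(attn, (1, 4))
# A higher `within` than `ctx` would correspond to the idiom's parts being
# grouped through attention, with reduced interaction with the context.
```

In the abstract's terms, a score of this kind computed on encoder self-attention (and, analogously, on decoder cross-attention over source tokens) is one way to quantify whether idioms are processed as single lexical units.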