The integration of syntactic structures into Transformer machine translation has shown positive results, but to our knowledge, no work has attempted the same with semantic structures. In this work we propose two novel parameter-free methods for injecting semantic information into Transformers, both of which rely on semantics-aware masking of (some of) the attention heads. One method operates on the encoder, through a Scene-Aware Self-Attention (SASA) head. The other operates on the decoder, through a Scene-Aware Cross-Attention (SACrA) head. We show a consistent improvement over the vanilla Transformer and syntax-aware models for four language pairs. We further show an additional gain when combining semantic and syntactic structures in some language pairs.
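To make the masking idea concrete, below is a minimal sketch of how a single attention head can be restricted by a precomputed binary scene mask. This is illustrative only and not the paper's implementation; the function name, tensor shapes, and the toy two-scene mask are assumptions introduced here for exposition.

```python
# Illustrative sketch of scene-aware attention masking (not the paper's code).
# Assumes a precomputed binary mask marking, for each query token, which
# tokens belong to the same semantic scene.
import torch
import torch.nn.functional as F


def scene_aware_attention(q, k, v, scene_mask):
    """One attention head whose scores are restricted by a scene mask.

    q, k, v:     (batch, seq_len, d_head) projections for a single head.
    scene_mask:  (batch, seq_len, seq_len); 1 where query token i may attend
                 to token j (same scene), 0 elsewhere.
    """
    d_head = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_head ** 0.5
    # Block attention outside the scene by pushing those scores to -inf
    scores = scores.masked_fill(scene_mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v)


# Toy usage: one 5-token sentence split into two scenes, {0,1,2} and {3,4}.
q = k = v = torch.randn(1, 5, 64)
mask = torch.zeros(1, 5, 5)
mask[0, :3, :3] = 1
mask[0, 3:, 3:] = 1
out = scene_aware_attention(q, k, v, mask)
print(out.shape)  # torch.Size([1, 5, 64])
```

In this sketch the mask is applied to a self-attention head, as in the encoder-side variant; a cross-attention variant would mask decoder-to-encoder scores instead. Because the mask is fixed and only overwrites attention scores, no additional parameters are introduced.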