Context-aware translation can be achieved by processing a concatenation of consecutive sentences with the standard Transformer architecture. This paper investigates the intuitive idea of providing the model with explicit information about the position of each sentence within the concatenation window. We compare various methods for encoding sentence positions into token representations, including several novel ones. Our results show that the Transformer benefits from certain sentence position encodings on English-to-Russian translation if trained with a context-discounted loss (Lupo et al., 2022). However, the same benefits are not observed on English-to-German translation. Further empirical work is needed to determine the conditions under which the proposed approach is beneficial.
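The abstract does not specify the encoding methods compared. As a purely illustrative sketch of the general idea, one common way to inject such information is a learned sentence-position embedding added to the token embeddings of the concatenation window; the class name, shapes, and design below are assumptions for illustration, not the paper's actual methods.

```python
# Minimal sketch (hypothetical, not the paper's implementation): add a
# learned embedding indicating which sentence of the concatenation window
# each token belongs to.
import torch
import torch.nn as nn


class SentencePositionEmbedding(nn.Module):
    """Adds a learned sentence-position embedding to token embeddings."""

    def __init__(self, max_sentences: int, d_model: int):
        super().__init__()
        self.emb = nn.Embedding(max_sentences, d_model)

    def forward(self, token_embeddings: torch.Tensor,
                sentence_ids: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq_len, d_model)
        # sentence_ids:     (batch, seq_len), index of the sentence each
        #                   token comes from, e.g. 0 for the oldest context
        #                   sentence, increasing toward the current one.
        return token_embeddings + self.emb(sentence_ids)


# Example: a window of 3 concatenated sentences, 6 tokens in total.
tok = torch.randn(1, 6, 512)
ids = torch.tensor([[0, 0, 1, 1, 2, 2]])
out = SentencePositionEmbedding(max_sentences=8, d_model=512)(tok, ids)
```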