Transformers are arguably the main workhorse in recent Natural Language Processing research. By definition, a Transformer is invariant with respect to reorderings of its input. However, language is inherently sequential, and word order is essential to the semantics and syntax of an utterance. In this article, we provide an overview and theoretical comparison of existing methods to incorporate position information into Transformer models. The objectives of this survey are to (1) showcase that position information in Transformers is a vibrant and extensive research area; (2) enable the reader to compare existing methods by providing a unified notation and systematization of different approaches along important model dimensions; (3) indicate which characteristics of an application should be taken into account when selecting a position encoding; and (4) provide stimuli for future research.
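The claim that a Transformer is order-invariant can be made concrete with a few lines of code. Below is a minimal sketch, not taken from the survey itself, showing that plain self-attention (here a single head with identity projections and no learned parameters, a simplifying assumption) is permutation-equivariant: permuting the input tokens merely permutes the outputs, so no information about word order survives. Adding the sinusoidal position encodings of Vaswani et al. (2017) breaks this symmetry.

```python
# Minimal sketch: self-attention without position information is
# permutation-equivariant; adding sinusoidal encodings breaks this.
# Identity projections are an illustrative simplification.
import numpy as np

def self_attention(x):
    """Single-head self-attention with identity Q/K/V projections."""
    scores = x @ x.T / np.sqrt(x.shape[-1])           # (n, n) attention logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ x                                # (n, d) outputs

def sinusoidal_encoding(n, d):
    """Absolute position encodings from 'Attention Is All You Need'."""
    pos = np.arange(n)[:, None]                       # positions 0..n-1
    i = np.arange(d // 2)[None, :]                    # dimension index
    angle = pos / 10000 ** (2 * i / d)
    enc = np.zeros((n, d))
    enc[:, 0::2] = np.sin(angle)                      # even dims: sine
    enc[:, 1::2] = np.cos(angle)                      # odd dims: cosine
    return enc

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                           # 5 "tokens", dim 8
perm = rng.permutation(5)

# Without positions: attending over permuted inputs just permutes outputs.
print(np.allclose(self_attention(x[perm]), self_attention(x)[perm]))  # True

# With positions added per slot: reordering the tokens changes the result,
# i.e. the model can now distinguish word orders.
pe = sinusoidal_encoding(5, 8)
print(np.allclose(self_attention(x[perm] + pe),
                  self_attention(x + pe)[perm]))                      # False
```

The first check prints `True` for any permutation, confirming equivariance; the second prints `False` because the encoding is tied to positions rather than tokens, which is precisely the role position information plays in the models surveyed here.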