Positional encoding is a key module in transformer-based deep neural architectures: it provides a way to inject positional information into the inputs of transformer layers. Its success has been rooted in the use of sinusoidal functions of various frequencies, which capture recurrent patterns with differing typical periods. In this work, an alternative set of periodic functions is proposed for positional encoding. These functions preserve some key properties of sinusoidal ones while departing from them in fundamental ways. Preliminary experiments are reported in which the original sinusoidal version is substantially outperformed. This strongly suggests that the alternative functions may find wider use in other transformer architectures.
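For context, the sinusoidal scheme the abstract contrasts against can be sketched as follows. This is a minimal NumPy sketch of the standard sinusoidal positional encoding (as in the original transformer), not the alternative functions proposed in this work; the function name is illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    Assumes d_model is even.
    """
    positions = np.arange(seq_len)[:, None]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # shape (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: sine of each frequency
    pe[:, 1::2] = np.cos(angles)   # odd dimensions: cosine of each frequency
    return pe

# Each column oscillates with a different period, so every position
# receives a distinct, smoothly varying vector added to its token embedding.
pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
```

The geometric frequency progression (periods from 2π up to 10000·2π) is what lets the encoding represent recurrent patterns at many typical periods simultaneously.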