Positional encoding is a key module in transformer-based deep neural architectures: it provides a way to inject positional information into the inputs of transformer layers. Its success has been rooted in the use of sinusoidal functions of various frequencies, which capture recurrent patterns with differing typical periods. In this work, an alternative set of periodic functions is proposed for positional encoding. These functions preserve some key properties of sinusoidal ones while departing from them in fundamental ways. Preliminary experiments are reported in which the original sinusoidal version is substantially outperformed. This strongly suggests that the alternative functions may find wider use in other transformer architectures.
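For context, the sinusoidal scheme the abstract contrasts against can be sketched as follows. This is a minimal NumPy sketch of the standard sinusoidal positional encoding (as in the original transformer), not the alternative functions proposed in this work; the function name is illustrative.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Standard sinusoidal positional encoding:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    Assumes d_model is even.
    """
    positions = np.arange(seq_len)[:, None]        # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]       # shape (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: sine of each frequency
    pe[:, 1::2] = np.cos(angles)   # odd dimensions: cosine of each frequency
    return pe

# Each column oscillates with a different period, so every position
# receives a distinct, smoothly varying vector added to its token embedding.
pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
```

The geometric frequency progression (periods from 2π up to 10000·2π) is what lets the encoding represent recurrent patterns at many typical periods simultaneously.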