Rethinking Positional Encoding - Zhuanzhi Papers


July 13, 2021

Rethinking Positional Encoding


Jianqiao Zheng, Sameera Ramasinghe, Simon Lucey

It is well noted that coordinate-based MLPs benefit greatly, in terms of preserving high-frequency information, through the encoding of coordinate positions as an array of Fourier features. Hitherto, the rationale for the effectiveness of these positional encodings has been studied solely through a Fourier lens. In this paper, we strive to broaden this understanding by showing that alternative non-Fourier embedding functions can indeed be used for positional encoding. Moreover, we show that their performance is entirely determined by a trade-off between the stable rank of the embedded matrix and the distance preservation between embedded coordinates. We further establish that the now ubiquitous Fourier feature mapping of position is a special case that fulfills these conditions. Consequently, we present a more general theory to analyze positional encoding in terms of shifted basis functions. To this end, we develop the necessary theoretical formulae and empirically verify that our theoretical claims hold in practice. Code is available at https://github.com/osiriszjq/Rethinking-positional-encoding.
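The two quantities named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' code (see the repository linked above for that); the function names `fourier_embed`, `gaussian_embed`, and `stable_rank`, the choice of integer frequencies, and the Gaussian widths are all assumptions made for illustration. Stable rank is taken here in its usual sense, the squared Frobenius norm divided by the squared spectral norm, and the Gaussian embedding is one possible example of a shifted basis function.

```python
import numpy as np


def fourier_embed(x, num_freqs):
    """Embed 1-D coordinates as [sin(2*pi*k*x), cos(2*pi*k*x)] for k = 1..num_freqs."""
    k = np.arange(1, num_freqs + 1)
    angles = 2.0 * np.pi * x[:, None] * k[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)


def gaussian_embed(x, num_centers, sigma):
    """Non-Fourier embedding: a Gaussian bump evaluated at evenly shifted centers."""
    centers = np.linspace(0.0, 1.0, num_centers)
    return np.exp(-0.5 * ((x[:, None] - centers[None, :]) / sigma) ** 2)


def stable_rank(matrix):
    """Squared Frobenius norm divided by squared spectral norm."""
    s = np.linalg.svd(matrix, compute_uv=False)  # singular values, descending
    return float(np.sum(s ** 2) / s[0] ** 2)


# 256 evenly spaced coordinates in [0, 1); all sizes below are illustrative.
x = np.linspace(0.0, 1.0, 256, endpoint=False)
embeddings = {
    "fourier": fourier_embed(x, 8),
    "gaussian (wide)": gaussian_embed(x, 16, 0.2),
    "gaussian (narrow)": gaussian_embed(x, 16, 0.02),
}
for name, emb in embeddings.items():
    print(f"{name}: shape={emb.shape}, stable rank={stable_rank(emb):.2f}")

# Distance preservation for the Fourier case: the inner product of two
# embedded points depends only on the offset between their coordinates.
gram = embeddings["fourier"] @ embeddings["fourier"].T
print(np.isclose(gram[0, 5], gram[10, 15]))
```

In this sketch, narrowing the Gaussian raises the stable rank because the columns overlap less, while widening it makes nearby coordinates more similar in the embedded space. That is a toy view of the trade-off the abstract describes, not a reproduction of the paper's experiments.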

