Self-attention is a widely used building block in neural modeling for mixing long-range data elements. Most self-attention networks employ pairwise dot-products to specify the attention coefficients. However, these methods incur an $O(N^2)$ computing cost for a sequence of length $N$. Even though several approximation methods have been introduced to relieve the quadratic cost, the performance of the dot-product approach is still bottlenecked by the low-rank constraint in the attention matrix factorization. In this paper, we propose a novel, scalable, and effective mixing building block called Paramixer. Our method factorizes the interaction matrix into several sparse matrices, where the non-zero entries are parameterized by MLPs that take the data elements as input. The overall computing cost of the new building block is as low as $O(N \log N)$. Moreover, all factorizing matrices in Paramixer are full-rank, so it does not suffer from the low-rank bottleneck. We have tested the new method on synthetic and various real-world long sequential data sets and compared it with several state-of-the-art attention networks. The experimental results show that Paramixer achieves better performance in most learning tasks.
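To make the mechanism described above concrete, the following is a minimal sketch, assuming a dyadic (power-of-two offset) sparse connection pattern, two non-zeros per row, and one small MLP per factor; these choices, along with the class and parameter names, are illustrative assumptions and not the published Paramixer construction. The sketch applies $O(\log N)$ sparse factors in sequence, with each factor's non-zero entries predicted from the data elements, so the total mixing cost is $O(N \log N)$.

```python
# Illustrative sketch only: sparse-factor mixing with data-parameterized
# non-zero entries. The dyadic offsets, 2 non-zeros per row, and MLP shapes
# below are assumptions for demonstration, not the paper's exact design.
import torch
import torch.nn as nn


class SparseFactorMixer(nn.Module):
    def __init__(self, dim: int, max_len: int):
        super().__init__()
        # Roughly log2(N) sparse factors cover all positions.
        self.n_factors = max(1, (max_len - 1).bit_length())
        # One small MLP per factor: it maps each token embedding to the
        # values of that row's two non-zero entries.
        self.weight_mlps = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, 2))
             for _ in range(self.n_factors)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, dim)
        n = x.size(1)
        idx = torch.arange(n, device=x.device)
        for k, mlp in enumerate(self.weight_mlps):
            offset = 2 ** k                 # dyadic "long-range" link distance
            partner = (idx + offset) % n    # second non-zero column per row
            w = mlp(x)                      # (batch, N, 2): data-driven entries
            # Apply one sparse factor: each row mixes itself with one partner.
            x = w[..., :1] * x + w[..., 1:] * x[:, partner, :]
        return x


# Usage: each factor costs O(N * dim), and there are O(log N) factors,
# so the overall mixing cost is O(N log N) rather than O(N^2).
mixer = SparseFactorMixer(dim=64, max_len=1024)
out = mixer(torch.randn(2, 1024, 64))
```

In this sketch, each loop iteration is equivalent to multiplying by a sparse matrix whose row $i$ has non-zeros at columns $i$ and $(i+2^k) \bmod N$; chaining the factors yields a dense effective interaction matrix without ever forming an $N \times N$ attention matrix or imposing a low-rank factorization.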