What Dense Graph Do You Need for Self-Attention? - 专知论文

May 31, 2022

What Dense Graph Do You Need for Self-Attention?

Yuxing Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu

From arXiv. Accepted by ICML 2022. Code is available at https://github.com/yxzwang/Normalized-Information-Payload

Transformers have made progress on a wide range of tasks, but suffer from quadratic computational and memory complexity. Recent works propose sparse Transformers, which compute attention on sparse graphs to reduce complexity while retaining strong performance. While these models are effective, the crucial question of how dense a graph needs to be to perform well has not been fully explored. In this paper, we propose Normalized Information Payload (NIP), a graph scoring function that measures information transfer on a graph and provides an analysis tool for the trade-off between performance and complexity. Guided by this theoretical analysis, we present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and achieves results comparable to or even better than the vanilla Transformer, with $O(N\log N)$ complexity for sequence length $N$. Experiments on tasks requiring various sequence lengths validate our graph scoring function.
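
The $O(N\log N)$ figure can be illustrated with a minimal sketch. It assumes the standard binary hypercube graph, in which two tokens are connected when their indices differ in exactly one bit; the paper's actual construction may differ, and the function names below are hypothetical and not taken from the authors' repository.

```python
def hypercube_neighbors(i: int, n: int) -> list[int]:
    """Neighbors of token i in a binary hypercube over n = 2**d tokens.

    Assumption: the standard hypercube graph (flip one bit of the index),
    which is not necessarily the exact graph used in the paper.
    """
    d = n.bit_length() - 1
    if n <= 0 or (1 << d) != n:
        raise ValueError("n must be a power of two")
    # Flipping bit k of i gives the neighbor along dimension k.
    return [i ^ (1 << k) for k in range(d)]


def count_attention_pairs(n: int) -> tuple[int, int]:
    """Return (dense_pairs, hypercube_pairs) of query-key pairs.

    Dense attention scores every pair: n * n.
    Hypercube attention scores each token against itself and its
    log2(n) neighbors: n * (log2(n) + 1), i.e. O(n log n).
    """
    d = n.bit_length() - 1
    if n <= 0 or (1 << d) != n:
        raise ValueError("n must be a power of two")
    return n * n, n * (d + 1)


if __name__ == "__main__":
    print(hypercube_neighbors(5, 8))
    print(count_attention_pairs(1024))
```

Under this assumption each token has $\log_2 N$ neighbors, so the number of scored query-key pairs grows as $N(\log_2 N + 1)$ rather than $N^2$.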
