Recently, vision transformers have become very popular. However, deploying them in many applications is computationally expensive, partly due to the Softmax layer in the attention block. We introduce a simple but effective Softmax-free attention block, SimA, which normalizes the query and key matrices with a simple $\ell_1$-norm instead of using a Softmax layer. The attention block in SimA then becomes a simple multiplication of three matrices, so SimA can dynamically change the ordering of the computation at test time to achieve computation linear in the number of tokens or the number of channels. We empirically show that SimA, applied to three SOTA variations of transformers, DeiT, XCiT, and CvT, achieves accuracy on par with the SOTA models without any need for a Softmax layer. Interestingly, changing SimA from multi-head to single-head has only a small effect on accuracy, which simplifies the attention block further. The code is available here: $\href{https://github.com/UCDvision/sima}{\text{This https URL}}$
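To make the idea concrete, the following is a minimal single-head PyTorch sketch of Softmax-free attention as described above: query and key are $\ell_1$-normalized and the multiplication order is chosen based on the token and channel counts. The normalization axis, the epsilon, and the unscaled single-head form are assumptions of this sketch; see the linked repository for the authors' implementation.

```python
import torch

def sima_attention(q, k, v):
    """Softmax-free attention sketch.

    q, k, v: tensors of shape (batch, tokens, channels).
    Assumes l1 normalization of q and k along the token dimension.
    """
    n, d = q.shape[-2], q.shape[-1]
    # Replace Softmax: l1-normalize each channel of q and k across tokens.
    q = q / (q.abs().sum(dim=-2, keepdim=True) + 1e-6)
    k = k / (k.abs().sum(dim=-2, keepdim=True) + 1e-6)
    if n > d:
        # O(N * D^2): compute (k^T v) first, then multiply by q.
        return q @ (k.transpose(-2, -1) @ v)
    # O(N^2 * D): standard ordering, (q k^T) first, then multiply by v.
    return (q @ k.transpose(-2, -1)) @ v

# Usage example with random tensors (196 tokens, 64 channels).
x = torch.randn(2, 196, 64)
out = sima_attention(x, x, x)
print(out.shape)  # torch.Size([2, 196, 64])
```

Because the attention output is a plain product of three matrices, both orderings give the same result, and the cheaper one can be picked at test time depending on whether tokens or channels dominate.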