Transformers have attracted a surge of interest across diverse vision tasks owing to their formidable performance. However, existing approaches primarily focus on optimizing internal model architectures, which often entails significant trial and error at high cost. In this work, we propose a new paradigm, dubbed Decision Stream Calibration, that boosts the performance of general Vision Transformers. To this end, we shed light on the information propagation mechanism in the learning procedure by exploring the correlation between different tokens and the relevance coefficients of individual dimensions. Our analysis reveals that 1) the final decision is associated with the tokens of foreground targets: features of foreground tokens are transmitted to the next layer as completely as possible, while useless features of background tokens are gradually eliminated during forward propagation; and 2) each category is associated with only specific sparse dimensions of the tokens. Based on these discoveries, we design a two-stage calibration scheme, ViT-Calibrator, consisting of a token propagation calibration stage and a dimension propagation calibration stage. Extensive experiments on commonly used datasets show that the proposed approach achieves promising results. The source code is provided in the supplementary material.
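To illustrate the first observation, the core token-weighting intuition can be sketched as follows. This is a minimal, hypothetical sketch, not the paper's ViT-Calibrator: it assumes a per-token relevance score is available (e.g., the attention a token receives from the class token), and the function name `calibrate_tokens` and its signature are illustrative only.

```python
import numpy as np

def calibrate_tokens(tokens, relevance, temperature=1.0):
    """Attenuate background tokens while preserving foreground tokens.

    tokens:    (num_tokens, dim) array of token features.
    relevance: (num_tokens,) raw per-token relevance scores
               (hypothetical; e.g., attention received from the class token).
    Returns the tokens scaled so that the most relevant (foreground)
    token passes through unchanged and low-relevance (background)
    tokens are suppressed before the next layer.
    """
    # Turn raw scores into a distribution over tokens.
    z = relevance / temperature
    z = z - z.max()                      # for numerical stability
    w = np.exp(z) / np.exp(z).sum()
    # Normalize so the strongest token keeps weight 1.0.
    w = w / w.max()
    # Broadcast the per-token weight over the feature dimension.
    return tokens * w[:, None]

# Toy example: 4 tokens of dimension 3; token 0 plays the "foreground" role.
toks = np.ones((4, 3))
rel = np.array([4.0, 1.0, 1.0, 1.0])
out = calibrate_tokens(toks, rel)
```

In this sketch, the foreground token is preserved exactly while background tokens shrink toward zero, mirroring the observed propagation behavior; the second observation (sparse category-specific dimensions) would analogously motivate a per-dimension weighting.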