VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking

April 13, 2023

Angelos Nalmpantis, Apostolos Panagiotopoulos, John Gkountouras, Konstantinos Papakostas, Wilker Aziz

From arXiv. Accepted at the XAI4CV Workshop at CVPR 2023.

The lack of interpretability of the Vision Transformer may hinder its use in critical real-world applications despite its effectiveness. To overcome this issue, we propose a post-hoc interpretability method called VISION DIFFMASK, which uses the activations of the model's hidden layers to predict the relevant parts of the input that contribute to its final predictions. Our approach uses a gating mechanism to identify the minimal subset of the original input that preserves the predicted distribution over classes. We demonstrate the faithfulness of our method by introducing a faithfulness task and comparing it to other state-of-the-art attribution methods on CIFAR-10 and ImageNet-1K, achieving compelling results. To aid reproducibility and further extension of our work, we open-source our implementation: https://github.com/AngelosNal/Vision-DiffMask
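The goal stated in the abstract, keeping the smallest set of input patches that still preserves the predicted class distribution, can be illustrated with a small sketch. This is not the authors' implementation: the abstract describes gates predicted from hidden-layer activations, whereas the sketch below swaps in a brute-force search over binary masks on a toy model. The names `toy_classifier`, `smallest_faithful_mask`, and `tolerance`, the zero baseline for masked patches, and the use of KL divergence as the measure of "preserving" the distribution are all assumptions made for illustration.

```python
import itertools
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def toy_classifier(patches, gates):
    """Hypothetical stand-in for a Vision Transformer.

    Each patch is a vector of per-class evidence; logits are the gated sum.
    A gate of 0 replaces the patch with a zero baseline (an assumption).
    """
    num_classes = len(patches[0])
    logits = [0.0] * num_classes
    for patch, gate in zip(patches, gates):
        for c in range(num_classes):
            logits[c] += gate * patch[c]
    return softmax(logits)


def smallest_faithful_mask(patches, tolerance=1e-3):
    """Return the smallest binary mask whose prediction stays within
    `tolerance` (in KL divergence) of the unmasked prediction.

    Brute force, so only feasible for a handful of patches.
    """
    n = len(patches)
    full = toy_classifier(patches, [1] * n)
    for k in range(n + 1):  # try smaller subsets first
        for kept in itertools.combinations(range(n), k):
            gates = [1 if i in kept else 0 for i in range(n)]
            masked = toy_classifier(patches, gates)
            if kl_divergence(full, masked) <= tolerance:
                return gates
    return [1] * n


# Four patches, three classes; only the first patch carries evidence.
patches = [
    [3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
mask = smallest_faithful_mask(patches)
print(mask)  # [1, 0, 0, 0]
```

The search keeps only the first patch because removing the other three leaves the predicted distribution unchanged. Exhaustive search grows exponentially with the number of patches, which is one reason a learned gating mechanism is attractive for real images.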

