Preserving Locality in Vision Transformers for Class Incremental Learning - 专知 Papers

Class Incremental Learning · Incremental Learning · Locality · Vision Transformer · Transformer

April 14, 2023

Preserving Locality in Vision Transformers for Class Incremental Learning

Bowen Zheng, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan

Learning new classes without forgetting is crucial for real-world applications of classification models. Vision Transformers (ViTs) have recently achieved remarkable performance in Class Incremental Learning (CIL). Previous works mainly focus on block design and model expansion for ViTs. In this paper, however, we find that when a ViT is trained incrementally, its attention layers gradually lose concentration on local features. We call this phenomenon Locality Degradation in ViTs for CIL. Since low-level local information is crucial to the transferability of the representation, it is beneficial to preserve locality in the attention layers. We encourage the model to preserve more local information as training proceeds and devise a Locality-Preserved Attention (LPA) layer to emphasize the importance of local features. Specifically, we incorporate local information directly into the vanilla attention and control the initial gradients of the vanilla attention by weighting it with a small initial value. Extensive experiments show that the representations facilitated by LPA capture more low-level general information, which is easier to transfer to follow-up tasks. The improved model achieves consistently better performance on CIFAR100 and ImageNet100.
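
The abstract does not state the exact LPA formula, so the following is only a rough sketch of the general idea under loudly stated assumptions, not the authors' implementation. It assumes tokens lie on a 1D sequence (a real ViT uses a 2D patch grid), that the "local information" is a fixed distance-based prior, and that the vanilla attention is added to this prior with a small weight `alpha` before renormalizing. The names `local_attention`, `mixed_attention`, `alpha`, and `sharpness` are hypothetical and chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def content_attention(queries, keys):
    """Vanilla scaled dot-product attention weights, one row per query."""
    scale = math.sqrt(len(keys[0]))
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        for q in queries
    ]


def local_attention(n_tokens, sharpness=1.0):
    """Assumed locality prior: weights decay with squared positional distance."""
    return [
        softmax([-sharpness * (i - j) ** 2 for j in range(n_tokens)])
        for i in range(n_tokens)
    ]


def mixed_attention(queries, keys, alpha=0.01, sharpness=1.0):
    """Local prior plus alpha-weighted vanilla attention, renormalized per row.

    A small alpha means the vanilla (content) branch contributes little at
    the start; in a trainable model alpha would be a learnable parameter.
    """
    content = content_attention(queries, keys)
    local = local_attention(len(queries), sharpness)
    mixed = []
    for c_row, l_row in zip(content, local):
        row = [l + alpha * c for l, c in zip(l_row, c_row)]
        total = sum(row)
        mixed.append([w / total for w in row])
    return mixed


# Four toy 2-dimensional tokens used as both queries and keys.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
weights = mixed_attention(tokens, tokens, alpha=0.01)
for row in weights:
    print([round(w, 3) for w in row])
```

With a small `alpha`, each token attends mostly to itself and its immediate neighbors; increasing `alpha` lets content-based attention take over. How the paper actually parameterizes and trains this weighting is not specified in the abstract.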
