关于没有蒸馏信号的小型自控反竞争模式的功效 (On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals) - 专知论文

会员服务 ·

0

contrastive · MoDELS · 蒸馏 · Performer · 模型性能 ·

2021 年 12 月 13 日

On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals

翻译：关于没有蒸馏信号的小型自控反竞争模式的功效

Haizhou Shi,Youcai Zhang,Siliang Tang,Wenjie Zhu,Yaqian Li,Yandong Guo,Yueting Zhuang

from arxiv, Accepted by AAAI'22

It is a consensus that small models perform quite poorly under the paradigm of self-supervised contrastive learning. Existing methods usually adopt a large off-the-shelf model to transfer knowledge to the small one via distillation. Despite their effectiveness, distillation-based methods may not be suitable for some resource-restricted scenarios due to the huge computational expenses of deploying a large model. In this paper, we study the issue of training self-supervised small models without distillation signals. We first evaluate the representation spaces of the small models and make two non-negligible observations: (i) the small models can complete the pretext task without overfitting despite their limited capacity and (ii) they universally suffer the problem of over clustering. Then we verify multiple assumptions that are considered to alleviate the over-clustering phenomenon. Finally, we combine the validated techniques and improve the baseline performances of five small architectures with considerable margins, which indicates that training small self-supervised contrastive models is feasible even without distillation signals. The code is available at \textit{https://github.com/WOWNICE/ssl-small}.

翻译：一种共识是,小型模型在自我监督的对比性学习范式下表现不佳。现有方法通常采用大型现成模型,通过蒸馏向小模型转让知识。尽管其效果有效,但蒸馏法可能不适用于某些资源受限制的假设情况,因为部署大型模型的计算费用巨大。在本文中,我们研究了在没有蒸馏信号的情况下培训自监督的小模型的问题。我们首先评估了小型模型的展示空间,并提出了两种不可忽略的观察:(一)小型模型尽管能力有限,但可以不过度装配,完成托辞任务;(二)这些小模型普遍受到过度集束问题的影响。然后,我们核实了考虑缓解超集束现象的多种假设。最后,我们结合了经过验证的技术,改进了5个规模小结构的基线性能,但有相当大的边际,这表明即使没有蒸馏信号,培训小型自监督的对比模型也是可行的。该代码可在\ textit{http://github.com/WOWNICE/sl-smal}

3

相关内容

contrastive

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

专知会员服务

137+阅读 · 2020年2月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

本周论文推荐 -- 对抗生成网络、知识图谱补全、对话系统、文本生成

本周论文推荐 -- 对抗生成网络、知识图谱补全、对话系统、文本生成

深度学习自然语言处理

8+阅读 · 2020年1月4日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View

Arxiv

0+阅读 · 2022年2月14日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Boosting Contrastive Learning with Relation Knowledge Distillation

Arxiv

9+阅读 · 2021年12月8日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive Embedding for Generalized Zero-Shot Learning

Arxiv

6+阅读 · 2021年3月30日

Self-supervised Learning: Generative or Contrastive

Arxiv

25+阅读 · 2021年3月20日

Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Arxiv

9+阅读 · 2020年11月28日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Contrastive Representation Distillation

Contrastive Representation Distillation

Arxiv

5+阅读 · 2019年10月23日

Multi-Task Self-Supervised Learning for Disfluency Detection

Arxiv

5+阅读 · 2019年8月15日

VIP会员

文章信息

相关主题

相关VIP内容

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

【Facebook AI】无监督机器翻译，336页ppt，Unsupervised Machine Translation

专知会员服务

19+阅读 · 2020年11月17日

【Google】平滑对抗训练，Smooth Adversarial Training

【Google】平滑对抗训练，Smooth Adversarial Training

专知会员服务

49+阅读 · 2020年7月4日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

112+阅读 · 2020年5月15日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

【WSDM2020】小数据学习，124页ppt，Learning with Small Data，宾夕法尼亚州立大学

专知会员服务

137+阅读 · 2020年2月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

数据智能体综述：新兴范式还是被高估的炒作？

海底战已至：美国构思海底安全战略 | 最新报告

【ICCV2025教程】视觉异常检测中的基础模型：进展、挑战与应用

美军将无人自主等新技术融入潜艇部队以更具杀伤力

相关资讯

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知

133+阅读 · 2020年3月18日

本周论文推荐 -- 对抗生成网络、知识图谱补全、对话系统、文本生成

本周论文推荐 -- 对抗生成网络、知识图谱补全、对话系统、文本生成

深度学习自然语言处理

8+阅读 · 2020年1月4日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

条件GAN重大改进！cGANs with Projection Discriminator

条件GAN重大改进！cGANs with Projection Discriminator

CreateAMind

8+阅读 · 2018年2月7日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View

Arxiv

0+阅读 · 2022年2月14日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Boosting Contrastive Learning with Relation Knowledge Distillation

Arxiv

9+阅读 · 2021年12月8日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Contrastive Embedding for Generalized Zero-Shot Learning

Arxiv

6+阅读 · 2021年3月30日

Self-supervised Learning: Generative or Contrastive

Arxiv

25+阅读 · 2021年3月20日

Time-series Change Point Detection with Self-Supervised Contrastive Predictive Coding

Arxiv

9+阅读 · 2020年11月28日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Contrastive Representation Distillation

Contrastive Representation Distillation

Arxiv

5+阅读 · 2019年10月23日

Multi-Task Self-Supervised Learning for Disfluency Detection

Arxiv

5+阅读 · 2019年8月15日

微信扫码咨询专知VIP会员