XKD: 跨模式知识蒸馏,与视频代表学习域对齐 (XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning) - 专知论文

会员服务 ·

0

知识 (knowledge) · Learning · 蒸馏 · Performer · 可辨认的 ·

2022 年 12 月 12 日

XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning

翻译：XKD: 跨模式知识蒸馏,与视频代表学习域对齐

Pritam Sarkar,Ali Etemad

We present XKD, a novel self-supervised framework to learn meaningful representations from unlabelled video clips. XKD is trained with two pseudo tasks. First, masked data reconstruction is performed to learn modality-specific representations. Next, self-supervised cross-modal knowledge distillation is performed between the two modalities through teacher-student setups to learn complementary information. To identify the most effective information to transfer and also to tackle the domain gap between audio and visual modalities which could hinder knowledge transfer, we introduce a domain alignment strategy for effective cross-modal distillation. Lastly, to develop a general-purpose solution capable of handling both audio and visual streams, a modality-agnostic variant of our proposed framework is introduced, which uses the same backbone for both audio and visual modalities. Our proposed cross-modal knowledge distillation improves linear evaluation top-1 accuracy of video action classification by 8.4% on UCF101, 8.1% on HMDB51, 13.8% on Kinetics-Sound, and 14.2% on Kinetics400. Additionally, our modality-agnostic variant shows promising results in developing a general-purpose network capable of handling different data streams. The code is released on the project website.

翻译：我们提出XKD,这是一个全新的自我监督框架,目的是从未贴标签的视频剪辑中学习有意义的代表。 XKD是经过培训的,有两种假任务。首先,进行蒙面数据重建,以学习特定模式的代表。接下来,在两种模式之间进行自我监督的跨模式知识蒸馏,通过师生设置进行自我监督的跨模式知识蒸馏,以学习补充信息。为了确定最有效的信息,以便转让,并解决可能阻碍知识转让的视听模式之间的领域差距,我们为有效的跨式蒸馏引入了域协调战略。最后,为了开发一种通用解决方案,能够同时处理视听流,我们拟议框架的一种模式-不可知变异模式被引入,同时使用相同的主干体,用于视听模式。我们提议的跨模式知识蒸馏将视频行动分类的线性评价第一至第一准确度提高8.4%,用于铀转化系数101,用于HMDB51的8.1%,用于营养-声音的13.8%,以及用于营养学的14.2%。此外,我们的模式-基因变异变量展示了能够处理不同数据流的网站。

0

相关内容

知识 (knowledge)

知识 (knowledge)

通过学习、实践或探索所获得的认识、判断或技能。

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

专知会员服务

40+阅读 · 2022年7月25日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

专知会员服务

13+阅读 · 2022年3月19日

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新六篇知识图谱相关论文—Zero-shot识别、卷积二维知识图谱、变分知识图谱推理、张量分解、推荐

【论文推荐】最新六篇知识图谱相关论文—Zero-shot识别、卷积二维知识图谱、变分知识图谱推理、张量分解、推荐

专知

50+阅读 · 2018年4月25日

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于BODIPY的光活化探针及其在超高分辨荧光成像中的研究

国家自然科学基金

0+阅读 · 2015年12月31日

HIF-1调控Galectin-1与S1PR1-STAT3信号轴对话并诱导胃癌特异性肝转移的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

高效长寿命量子存储

国家自然科学基金

0+阅读 · 2013年12月31日

基于能量输配的煤泥分选过程强化机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

具有AIE特性的吡咯并吡咯二酮近红外共轭聚电解质的合成及其在肿瘤细胞靶向荧光成像中的应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

SnO2@CNT气体敏感材料的结构设计合成与敏感机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

金属氧化物有序纳米结构阵列在染料电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

基于RGD配体设计的CdTe量子点靶向探针的直接合成及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

具有a-glucosidase抑制活性的新型穿心莲内酯衍生物抗细胞黏附和血管生成作用及机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

Robust Representation Learning with Self-Distillation for Domain Generalization

Arxiv

0+阅读 · 2023年2月14日

Learning Aligned Cross-modal Representations for Referring Image Segmentation

Arxiv

0+阅读 · 2023年2月10日

Cross-Modal Discrete Representation Learning

Arxiv

18+阅读 · 2021年6月10日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

【图机器学习进展与趋势@ICML2022】Graph Machine Learning @ ICML 2022

专知会员服务

40+阅读 · 2022年7月25日

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

【CVPR 2022】长尾视觉数据识别的嵌套式协同学习方法 Nested Collaborative Learning for Long-Tailed Visual Recognition

专知会员服务

13+阅读 · 2022年3月19日

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

【CVPR 2022】基于粗粒度和细粒度特征匹配的视频描述评估，EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

专知会员服务

10+阅读 · 2022年3月19日

最新《自监督表示学习》报告，70页ppt

最新《自监督表示学习》报告，70页ppt

专知会员服务

86+阅读 · 2020年12月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ExBert — 可视化分析Transformer学到的表示

ExBert — 可视化分析Transformer学到的表示

专知会员服务

32+阅读 · 2019年10月16日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

不确定环境下无人机三维路径规划研究 | 221页

远征作战军事后勤规划

大语言模型将如何改变军事指挥结构

美陆军能力集成与开发系统（ACIDS）流程指南 | 2025最新122页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新六篇知识图谱相关论文—Zero-shot识别、卷积二维知识图谱、变分知识图谱推理、张量分解、推荐

【论文推荐】最新六篇知识图谱相关论文—Zero-shot识别、卷积二维知识图谱、变分知识图谱推理、张量分解、推荐

专知

50+阅读 · 2018年4月25日

相关论文

Robust Representation Learning with Self-Distillation for Domain Generalization

Arxiv

0+阅读 · 2023年2月14日

Learning Aligned Cross-modal Representations for Referring Image Segmentation

Arxiv

0+阅读 · 2023年2月10日

Cross-Modal Discrete Representation Learning

Arxiv

18+阅读 · 2021年6月10日

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Arxiv

18+阅读 · 2021年4月4日

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

Arxiv

23+阅读 · 2021年3月3日

Time-Series Event Prediction with Evolutionary State Graph

Arxiv

14+阅读 · 2020年11月25日

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Arxiv

19+阅读 · 2020年3月31日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

Learning Embedding Adaptation for Few-Shot Learning

Learning Embedding Adaptation for Few-Shot Learning

Arxiv

17+阅读 · 2018年12月10日

Learning over Knowledge-Base Embeddings for Recommendation

Arxiv

23+阅读 · 2018年3月22日

相关基金

MARVELD1基因调控肝细胞癌介入治疗的机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

基于BODIPY的光活化探针及其在超高分辨荧光成像中的研究

国家自然科学基金

0+阅读 · 2015年12月31日

HIF-1调控Galectin-1与S1PR1-STAT3信号轴对话并诱导胃癌特异性肝转移的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

高效长寿命量子存储

国家自然科学基金

0+阅读 · 2013年12月31日

基于能量输配的煤泥分选过程强化机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

具有AIE特性的吡咯并吡咯二酮近红外共轭聚电解质的合成及其在肿瘤细胞靶向荧光成像中的应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

SnO2@CNT气体敏感材料的结构设计合成与敏感机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

金属氧化物有序纳米结构阵列在染料电池中的应用

国家自然科学基金

0+阅读 · 2011年12月31日

基于RGD配体设计的CdTe量子点靶向探针的直接合成及其应用

国家自然科学基金

0+阅读 · 2009年12月31日

具有a-glucosidase抑制活性的新型穿心莲内酯衍生物抗细胞黏附和血管生成作用及机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员