Current state-of-the-art approaches for few-shot action recognition achieve promising performance by conducting frame-level matching on learned visual features. However, they generally suffer from two limitations: i) matching between local frames tends to be inaccurate because nothing guides the features toward long-range temporal perception; ii) explicit motion learning is usually ignored, leading to partial information loss. To address these issues, we develop a Motion-augmented Long-short Contrastive Learning (MoLo) method with two crucial components: a long-short contrastive objective and a motion autodecoder. Specifically, the long-short contrastive objective endows local frame features with long-form temporal awareness by maximizing their agreement with the global token of videos belonging to the same class. The motion autodecoder is a lightweight architecture that reconstructs pixel motions from differential features, explicitly embedding motion dynamics into the network. In this way, MoLo simultaneously learns long-range temporal context and motion cues for comprehensive few-shot matching. To demonstrate its effectiveness, we evaluate MoLo on five standard benchmarks; the results show that MoLo favorably outperforms recent state-of-the-art methods. The source code is available at https://github.com/alibaba-mmai-research/MoLo.
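The two components described above can be illustrated with a minimal sketch: an InfoNCE-style loss that pulls local frame features toward the global token of a same-class video, plus adjacent-frame feature differencing as the "differential features" fed to the motion branch. This is a hypothetical simplification for intuition only — the function names, temperature value, and negative-sampling scheme are assumptions, not the paper's actual implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    # Normalize features to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def frame_differences(frames):
    # Differential features: differences of temporally adjacent frame features,
    # a simple stand-in for the motion cues the autodecoder reconstructs from.
    return frames[1:] - frames[:-1]

def long_short_contrastive_loss(local_feats, global_token, neg_tokens, tau=0.1):
    # InfoNCE-style agreement between each local frame feature (T, D) and the
    # same-class global token (D,), contrasted against global tokens of other
    # classes (N, D). Hypothetical simplification of the long-short objective.
    local = l2_normalize(local_feats)                 # (T, D)
    pos = l2_normalize(global_token)                  # (D,)
    negs = l2_normalize(neg_tokens)                   # (N, D)
    pos_sim = local @ pos / tau                       # (T,)
    neg_sim = local @ negs.T / tau                    # (T, N)
    logits = np.concatenate([pos_sim[:, None], neg_sim], axis=1)
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_prob = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob.mean())
```

As a sanity check, frame features that are already aligned with their class's global token yield a lower loss than random features, which is the gradient signal that injects long-range temporal awareness into local frames.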