弱势监督时间行动地方化跨模式共识网络 (Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization) - 专知论文

会员服务 ·

0

INFORMS · Weight · Extensibility · 注意力机制 · Networking ·

2021 年 7 月 27 日

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

翻译：弱势监督时间行动地方化跨模式共识网络

Fa-Ting Hong,Jia-Chang Feng,Dan Xu,Ying Shan,Wei-Shi Zheng

from arxiv, ACM International Conference on Multimedia, 2021

Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision. Both appearance and motion features are used in previous works, while they do not utilize them in a proper way but apply simple concatenation or score-level fusion. In this work, we argue that the features extracted from the pretrained extractor, e.g., I3D, are not the WS-TALtask-specific features, thus the feature re-calibration is needed for reducing the task-irrelevant information redundancy. Therefore, we propose a cross-modal consensus network (CO2-Net) to tackle this problem. In CO2-Net, we mainly introduce two identical proposed cross-modal consensus modules (CCM) that design a cross-modal attention mechanism to filter out the task-irrelevant information redundancy using the global information from the main modality and the cross-modal local information of the auxiliary modality. Moreover, we treat the attention weights derived from each CCMas the pseudo targets of the attention weights derived from another CCM to maintain the consistency between the predictions derived from two CCMs, forming a mutual learning manner. Finally, we conduct extensive experiments on two common used temporal action localization datasets, THUMOS14 and ActivityNet1.2, to verify our method and achieve the state-of-the-art results. The experimental results show that our proposed cross-modal consensus module can produce more representative features for temporal action localization.

翻译：微弱监管的时间行动本地化(WS-TAL)是一项具有挑战性的任务,旨在将特定视频中的行动场景与视频级的绝对监管实现本地化。在以往的作品中,使用外观和运动功能,虽然它们没有以适当的方式使用它们,而是应用简单的共化或分级级融合。在这项工作中,我们争辩说,从预先培训的提取器(如I3D)中提取的特征不是WS-TALtask特有功能,因此,需要重新校正功能来减少任务相关信息冗余。因此,我们提议在以往的作品中采用跨模式的共识网络(CO2-Net)来解决这一问题。在CO2-Net中,我们主要采用两个相同的拟议的跨模式共识模块(CM),设计一个跨模式关注机制来过滤任务相关信息冗余,例如I3D, 并不是WS-Taltask-Tal-formation-formational 。因此,我们从每个CCMA中得出的关注度的假称目标的权重。从另一个CM-Net模块(CO-CM)的假称的假称目标分级化, 以维持共同的实验行动在最后的两种方法上产生的结果。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【WWW2021】RetaGNN:面向整体序列推荐的关系时态注意力图神经网络

【WWW2021】RetaGNN:面向整体序列推荐的关系时态注意力图神经网络

专知会员服务

32+阅读 · 2021年2月2日

【AAAI2021】时间关系建模与自监督的动作分割

【AAAI2021】时间关系建模与自监督的动作分割

专知会员服务

37+阅读 · 2021年1月24日

【AAAI2020】CompFeat:用于视频实例分割的综合特征聚合

专知会员服务

9+阅读 · 2020年12月10日

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

专知会员服务

84+阅读 · 2020年11月25日

【NeurIPS 2020】对比学习全局和局部医学图像分割特征

【NeurIPS 2020】对比学习全局和局部医学图像分割特征

专知会员服务

44+阅读 · 2020年10月20日

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

专知会员服务

24+阅读 · 2020年7月28日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

【CVPR2020-Oral】自监督单目场景流量估计，Self-Supervised Monocular SFE

【CVPR2020-Oral】自监督单目场景流量估计，Self-Supervised Monocular SFE

专知会员服务

23+阅读 · 2020年4月9日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

内涵网络嵌入：Content-rich Network Embedding

内涵网络嵌入：Content-rich Network Embedding

我爱读PAMI

4+阅读 · 2019年11月5日

鲁棒机器学习相关文献集

鲁棒机器学习相关文献集

专知

8+阅读 · 2019年8月18日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

弱监督语义分割最新方法资源列表

弱监督语义分割最新方法资源列表

专知

9+阅读 · 2019年2月26日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

《pyramid Attention Network for Semantic Segmentation》

《pyramid Attention Network for Semantic Segmentation》

统计学习与视觉计算组

44+阅读 · 2018年8月30日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

DemiNet: Dependency-Aware Multi-Interest Network with Self-Supervised Graph Learning for Click-Through Rate Prediction

Arxiv

0+阅读 · 2021年9月26日

Self-supervised Learning for Semi-supervised Temporal Language Grounding

Self-supervised Learning for Semi-supervised Temporal Language Grounding

Arxiv

0+阅读 · 2021年9月23日

Cross Attention-guided Dense Network for Images Fusion

Cross Attention-guided Dense Network for Images Fusion

Arxiv

1+阅读 · 2021年9月23日

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Arxiv

5+阅读 · 2021年3月24日

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Arxiv

3+阅读 · 2021年3月4日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

Arxiv

6+阅读 · 2020年3月18日

Dual Temporal Memory Network for Efficient Video Object Segmentation

Dual Temporal Memory Network for Efficient Video Object Segmentation

Arxiv

5+阅读 · 2020年3月13日

Towards Precise End-to-end Weakly Supervised Object Detection Network

Towards Precise End-to-end Weakly Supervised Object Detection Network

Arxiv

4+阅读 · 2019年11月27日

Graph Convolutional Networks for Temporal Action Localization

Arxiv

5+阅读 · 2019年9月7日

VIP会员

文章信息

相关主题

注意力机制

相关VIP内容

【WWW2021】RetaGNN:面向整体序列推荐的关系时态注意力图神经网络

【WWW2021】RetaGNN:面向整体序列推荐的关系时态注意力图神经网络

专知会员服务

32+阅读 · 2021年2月2日

【AAAI2021】时间关系建模与自监督的动作分割

【AAAI2021】时间关系建模与自监督的动作分割

专知会员服务

37+阅读 · 2021年1月24日

【AAAI2020】CompFeat:用于视频实例分割的综合特征聚合

专知会员服务

9+阅读 · 2020年12月10日

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

【UIUC】最新《自监督学习》教程，51页ppt，Self-supervised learning

专知会员服务

84+阅读 · 2020年11月25日

【NeurIPS 2020】对比学习全局和局部医学图像分割特征

【NeurIPS 2020】对比学习全局和局部医学图像分割特征

专知会员服务

44+阅读 · 2020年10月20日

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

最新《深度学习视频超分》综述论文，30页pdf，Video Super Resolution Based on Deep Learning: A comprehensive survey

专知会员服务

24+阅读 · 2020年7月28日

【Google】监督对比学习，Supervised Contrastive Learning

【Google】监督对比学习，Supervised Contrastive Learning

专知会员服务

75+阅读 · 2020年4月24日

【CVPR2020-Oral】自监督单目场景流量估计，Self-Supervised Monocular SFE

【CVPR2020-Oral】自监督单目场景流量估计，Self-Supervised Monocular SFE

专知会员服务

23+阅读 · 2020年4月9日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

7500字 | 《实现压倒性优势：美陆军人工智能未来图景》附原文

中文版2500字 | 美陆军“项目融合”计划：推动大规模作战行动中的目标定位革新（附原文）

4300字《创新防御：生成式人工智能在美国军事演进中的角色》附原文

《重新思考军事战略：战斗消耗中暴露度研究》258页博士论文

相关资讯

内涵网络嵌入：Content-rich Network Embedding

内涵网络嵌入：Content-rich Network Embedding

我爱读PAMI

4+阅读 · 2019年11月5日

鲁棒机器学习相关文献集

鲁棒机器学习相关文献集

专知

8+阅读 · 2019年8月18日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

弱监督语义分割最新方法资源列表

弱监督语义分割最新方法资源列表

专知

9+阅读 · 2019年2月26日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

《pyramid Attention Network for Semantic Segmentation》

《pyramid Attention Network for Semantic Segmentation》

统计学习与视觉计算组

44+阅读 · 2018年8月30日

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

【论文推荐】最新四篇CVPR2018 视频描述生成相关论文—双向注意力、Transformer、重构网络、层次强化学习

专知

31+阅读 · 2018年6月4日

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

视频超分辨 Detail-revealing Deep Video Super-resolution 论文笔记

统计学习与视觉计算组

17+阅读 · 2018年3月16日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

DemiNet: Dependency-Aware Multi-Interest Network with Self-Supervised Graph Learning for Click-Through Rate Prediction

Arxiv

0+阅读 · 2021年9月26日

Self-supervised Learning for Semi-supervised Temporal Language Grounding

Self-supervised Learning for Semi-supervised Temporal Language Grounding

Arxiv

0+阅读 · 2021年9月23日

Cross Attention-guided Dense Network for Images Fusion

Cross Attention-guided Dense Network for Images Fusion

Arxiv

1+阅读 · 2021年9月23日

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

Arxiv

5+阅读 · 2021年3月24日

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Arxiv

3+阅读 · 2021年3月4日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Action Segmentation with Joint Self-Supervised Temporal Domain Adaptation

Arxiv

6+阅读 · 2020年3月18日

Dual Temporal Memory Network for Efficient Video Object Segmentation

Dual Temporal Memory Network for Efficient Video Object Segmentation

Arxiv

5+阅读 · 2020年3月13日

Towards Precise End-to-end Weakly Supervised Object Detection Network

Towards Precise End-to-end Weakly Supervised Object Detection Network

Arxiv

4+阅读 · 2019年11月27日

Graph Convolutional Networks for Temporal Action Localization

Arxiv

5+阅读 · 2019年9月7日

微信扫码咨询专知VIP会员