Most current action recognition methods rely heavily on appearance information by taking an RGB sequence of entire image regions as input. While effective at exploiting contextual information around humans, e.g., human appearance and scene category, they are easily fooled by out-of-context action videos where the context does not match the target action. In contrast, pose-based methods, which take only a sequence of human skeletons as input, suffer from inaccurate pose estimation or the inherent ambiguity of human pose. Integrating these two approaches has turned out to be non-trivial: training a model with both appearance and pose ends up with a strong bias towards appearance and does not generalize well to unseen videos. To address this problem, we propose to learn pose-driven feature integration that dynamically combines the appearance and pose streams by observing pose features on the fly. The main idea is to let the pose stream decide how much and which appearance information is used in the integration, based on whether the given pose information is reliable or not. We show that the proposed IntegralAction achieves highly robust performance across in-context and out-of-context action video datasets. The code is available at https://github.com/mks0601/IntegralAction_RELEASE.
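To make the pose-driven integration idea concrete, below is a minimal sketch of one way such a gating fusion could look: the pose stream produces a per-channel sigmoid weight that scales the appearance features before the two streams are summed. The module name, layer sizes, and fusion form are illustrative assumptions, not the released IntegralAction implementation.

```python
import torch
import torch.nn as nn

class PoseDrivenGating(nn.Module):
    """Illustrative sketch of pose-driven feature integration.
    The pose stream emits a gate that decides how much appearance
    information is trusted in the fused representation.
    Names and dimensions are assumptions, not the authors' code."""

    def __init__(self, feat_dim=512):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.Sigmoid(),  # per-channel weight in [0, 1]
        )

    def forward(self, appearance_feat, pose_feat):
        # Gate computed from pose features only: unreliable pose can
        # suppress or admit appearance information channel-wise.
        w = self.gate(pose_feat)                 # (B, feat_dim)
        return pose_feat + w * appearance_feat   # gated fusion


# Toy usage: a batch of 4 clips with 512-dim features per stream.
appearance = torch.randn(4, 512)
pose = torch.randn(4, 512)
fused = PoseDrivenGating()(appearance, pose)
print(fused.shape)  # torch.Size([4, 512])
```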