面向一个-一个-T Spadio-T 用于基于Sleton的行动确认的时时焦点 (Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition) - 专知论文

会员服务 ·

0

新加坡南洋理工大学 · 图卷积神经网络/图卷积网络 · Better · state-of-the-art · MoDELS ·

2022 年 2 月 4 日

Towards To-a-T Spatio-Temporal Focus for Skeleton-Based Action Recognition

翻译：面向一个-一个-T Spadio-T 用于基于Sleton的行动确认的时时焦点

Lipeng Ke,Kuan-Chuan Peng,Siwei Lyu

from arxiv, AAAI 2022

Graph Convolutional Networks (GCNs) have been widely used to model the high-order dynamic dependencies for skeleton-based action recognition. Most existing approaches do not explicitly embed the high-order spatio-temporal importance to joints' spatial connection topology and intensity, and they do not have direct objectives on their attention module to jointly learn when and where to focus on in the action sequence. To address these problems, we propose the To-a-T Spatio-Temporal Focus (STF), a skeleton-based action recognition framework that utilizes the spatio-temporal gradient to focus on relevant spatio-temporal features. We first propose the STF modules with learnable gradient-enforced and instance-dependent adjacency matrices to model the high-order spatio-temporal dynamics. Second, we propose three loss terms defined on the gradient-based spatio-temporal focus to explicitly guide the classifier when and where to look at, distinguish confusing classes, and optimize the stacked STF modules. STF outperforms the state-of-the-art methods on the NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400 datasets in all 15 settings over different views, subjects, setups, and input modalities, and STF also shows better accuracy on scarce data and dataset shifting settings.

翻译：为了解决这些问题,我们广泛使用图-T Spatio-Temporal Focus(GCNs)来模拟基于骨骼的动作识别的高顺序动态依赖性(GCNs),大多数现有办法并未明确将高阶时空对联合空间连接表层学和强度的重要性嵌入高阶空间连接表层和强度,而且它们也没有直接的目标,即关注模块,以在行动序列中共同学习何时和在何处应重点关注的问题。为了解决这些问题,我们建议了T-a-T Spatio-Timal Formal Focus(STF),一个基于400个基于骨架的行动识别框架,利用时空梯度梯度梯度来重点关注相关的时空特征。我们首先建议了具有可学习性梯度增强和因实例而依赖的相近度矩阵模块模块,用于模拟高阶调时空动态。第二,我们提议了三个损失术语,即基于梯度的调调调调调调点,以明确指导分类者,区分混淆的等级和优化堆叠的STF模块。

0

相关内容

新加坡南洋理工大学

新加坡南洋理工大学

新加坡南洋理工大学（南洋理工大学新加坡分校）是一所研究型公立大学，拥有工程、商业、科学、人文、艺术、社会科学、教育和医学的33000名本科生和研究生。

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【行为识别| 2019最新综述】时空动作识别综述（Spatio-temporal Action Recognition: A Survey），附15页PDF

【行为识别| 2019最新综述】时空动作识别综述（Spatio-temporal Action Recognition: A Survey），附15页PDF

专知会员服务

100+阅读 · 2019年11月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

基于时空模式的复杂行为识别方法研究

国家自然科学基金

2+阅读 · 2017年12月31日

大型网络中基于局部谱的社团检测算法研究

国家自然科学基金

0+阅读 · 2017年12月31日

自动管电流调制CT辐射剂量与影像质量评价方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

中继系统最优物理层安全波形关键技术研究

国家自然科学基金

1+阅读 · 2013年12月31日

土木工程中CFRP构件的涡流热成像损伤检测方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

目标函数多次波逆时叠前偏移

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

高流速下缓蚀剂在H2S和CO2环境中的构效关系

国家自然科学基金

0+阅读 · 2011年12月31日

震源参数对基于有限断层长周期地震动数值模拟的影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

二维海面电磁散射数值算法与场统计分布研究

国家自然科学基金

0+阅读 · 2009年12月31日

Nested Collaborative Learning for Long-Tailed Visual Recognition

Arxiv

0+阅读 · 2022年4月19日

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition

Arxiv

0+阅读 · 2022年4月19日

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Arxiv

0+阅读 · 2022年4月19日

Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition

Arxiv

0+阅读 · 2022年4月19日

Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition

Arxiv

0+阅读 · 2022年4月18日

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

Arxiv

0+阅读 · 2022年4月18日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

VIP会员

文章信息

相关主题

新加坡南洋理工大学

图卷积神经网络/图卷积网络

state-of-the-art

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【行为识别| 2019最新综述】时空动作识别综述（Spatio-temporal Action Recognition: A Survey），附15页PDF

【行为识别| 2019最新综述】时空动作识别综述（Spatio-temporal Action Recognition: A Survey），附15页PDF

专知会员服务

100+阅读 · 2019年11月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

GPT-5如何对齐？从硬性拒绝到安全完成：走向以输出为中心的安全训练

【伯克利博士论文】超越人类监督的视觉智能

【ICCV2025】SO(3) 上连续非保守动力系统的预测

2025年中国数据要素行业发展研究报告

相关资讯

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

相关论文

Nested Collaborative Learning for Long-Tailed Visual Recognition

Arxiv

0+阅读 · 2022年4月19日

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition

Arxiv

0+阅读 · 2022年4月19日

Bootstrapped Representation Learning for Skeleton-Based Action Recognition

Arxiv

0+阅读 · 2022年4月19日

Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition

Arxiv

0+阅读 · 2022年4月19日

Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition

Arxiv

0+阅读 · 2022年4月18日

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

Arxiv

0+阅读 · 2022年4月18日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Learning Latent Representations to Influence Multi-Agent Interaction

Arxiv

11+阅读 · 2020年11月12日

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image

Arxiv

12+阅读 · 2020年2月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

相关基金

基于时空模式的复杂行为识别方法研究

国家自然科学基金

2+阅读 · 2017年12月31日

大型网络中基于局部谱的社团检测算法研究

国家自然科学基金

0+阅读 · 2017年12月31日

自动管电流调制CT辐射剂量与影像质量评价方法研究

国家自然科学基金

0+阅读 · 2014年12月31日

中继系统最优物理层安全波形关键技术研究

国家自然科学基金

1+阅读 · 2013年12月31日

土木工程中CFRP构件的涡流热成像损伤检测方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

目标函数多次波逆时叠前偏移

国家自然科学基金

0+阅读 · 2012年12月31日

关于AI-半环簇与 Conway半环簇的研究

国家自然科学基金

1+阅读 · 2012年12月31日

高流速下缓蚀剂在H2S和CO2环境中的构效关系

国家自然科学基金

0+阅读 · 2011年12月31日

震源参数对基于有限断层长周期地震动数值模拟的影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

二维海面电磁散射数值算法与场统计分布研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员