模拟长期互动以加强行动确认 (Modeling long-term interactions to enhance action recognition) - 专知论文

会员服务 ·

0

INTERACT · INFORMS · Performer · state-of-the-art · MoDELS ·

2021 年 4 月 23 日

Modeling long-term interactions to enhance action recognition

翻译：模拟长期互动以加强行动确认

Alejandro Cartas,Petia Radeva,Mariella Dimiccoli

from arxiv, Accepted to the 25th International Conference on Pattern Recognition (ICPR), 2021

In this paper, we propose a new approach to under-stand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user hands and a set of secondary regions potentially corresponding to the interacting objects and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical LongShort-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks,without relying on motion information

翻译：在本文中,我们建议对以自我为中心的视频中的暗中行为采取新的办法,利用框架和时间层面物体相互作用的语义。在框架层面,我们采用基于区域的办法,将一个与用户手相对应的主要区域和一组可能与互动对象相对应的第二区域作为投入,并通过有线电视新闻网的配方计算行动得分。然后,将这一资料输入一个等级式的长期短期内存网络(HLSTM),这个网络可以捕捉不同镜头之间行动之间的时间依赖性。模拟研究彻底验证了拟议办法,特别表明HLSTM结构的两个级别都有助于业绩的改善。此外,定量比较表明,拟议的办法在不依赖运动信息的情况下,在标准基准的行动承认方面超过了最新水平。

0

相关内容

INTERACT

IFIP TC13 Conference on Human-Computer Interaction是人机交互领域的研究者和实践者展示其工作的重要平台。多年来，这些会议吸引了来自几个国家和文化的研究人员。官网链接：http://interact2019.org/

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

【2020Manning新书】微前端实战，Micro Frontends in Action，296页pdf

【2020Manning新书】微前端实战，Micro Frontends in Action，296页pdf

专知会员服务

55+阅读 · 2020年8月28日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

专知会员服务

39+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019| 04-03更新10篇论文及代码（3篇oral、含GAN、文本图像生成等）

CVPR2019| 04-03更新10篇论文及代码（3篇oral、含GAN、文本图像生成等）

极市平台

18+阅读 · 2019年4月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Multi-level Attention Fusion Network for Audio-visual Event Recognition

Arxiv

0+阅读 · 2021年6月12日

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Arxiv

3+阅读 · 2021年3月4日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Multimodal Semantic Attention Network for Video Captioning

Arxiv

4+阅读 · 2019年5月8日

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

Arxiv

9+阅读 · 2019年3月29日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

Video-to-Video Synthesis

Video-to-Video Synthesis

Arxiv

9+阅读 · 2018年8月20日

Learning Instance Segmentation by Interaction

Arxiv

6+阅读 · 2018年6月21日

Integrating both Visual and Audio Cues for Enhanced Video Caption

Arxiv

4+阅读 · 2017年12月9日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

【2020Manning新书】微前端实战，Micro Frontends in Action，296页pdf

【2020Manning新书】微前端实战，Micro Frontends in Action，296页pdf

专知会员服务

55+阅读 · 2020年8月28日

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

【MIT深度学习课程】深度序列建模，Deep Sequence Modeling

专知会员服务

78+阅读 · 2020年2月3日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

【视频中的零样本动作识别：综述】Zero-Shot Action Recognition in Videos: A Survey

专知会员服务

39+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【牛津博士论文】零样本强化学习综述

《美军条令：陆军指挥官与规划人员地理空间指南》60页

战术边缘指挥控制：防务面临的核心挑战

迈向开放世界检测：综述

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

简评 | Video Action Recognition 的近期进展

简评 | Video Action Recognition 的近期进展

极市平台

20+阅读 · 2019年4月21日

CVPR2019| 04-03更新10篇论文及代码（3篇oral、含GAN、文本图像生成等）

CVPR2019| 04-03更新10篇论文及代码（3篇oral、含GAN、文本图像生成等）

极市平台

18+阅读 · 2019年4月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

carla 学习笔记

carla 学习笔记

CreateAMind

9+阅读 · 2018年2月7日

carla无人驾驶模拟中文项目 carla_simulator_Chinese

carla无人驾驶模拟中文项目 carla_simulator_Chinese

CreateAMind

3+阅读 · 2018年1月30日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Multi-level Attention Fusion Network for Audio-visual Event Recognition

Arxiv

0+阅读 · 2021年6月12日

Modeling Multi-Label Action Dependencies for Temporal Action Localization

Arxiv

3+阅读 · 2021年3月4日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Multimodal Semantic Attention Network for Video Captioning

Arxiv

4+阅读 · 2019年5月8日

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

Arxiv

9+阅读 · 2019年3月29日

SlowFast Networks for Video Recognition

SlowFast Networks for Video Recognition

Arxiv

19+阅读 · 2018年12月10日

Video-to-Video Synthesis

Video-to-Video Synthesis

Arxiv

9+阅读 · 2018年8月20日

Learning Instance Segmentation by Interaction

Arxiv

6+阅读 · 2018年6月21日

Integrating both Visual and Audio Cues for Enhanced Video Caption

Arxiv

4+阅读 · 2017年12月9日

微信扫码咨询专知VIP会员