Given a video captured from a first-person perspective and recorded in a familiar environment, can we recognize what the person is doing and identify where the action occurs in 3D space? We address this challenging problem of jointly recognizing and localizing the actions of a mobile user on a known 3D map from egocentric videos. To this end, we propose a novel deep probabilistic model. Our model takes as input a Hierarchical Volumetric Representation (HVR) of the environment and an egocentric video, infers the 3D action location as a latent variable, and recognizes the action based on the video and contextual cues surrounding its potential locations. To evaluate our model, we conduct extensive experiments on a newly collected egocentric video dataset, in which both naturalistic human actions and photo-realistic 3D environment reconstructions are captured. Our method demonstrates strong results on both action recognition and 3D action localization across seen and unseen environments. We believe our work points to an exciting research direction at the intersection of egocentric vision and 3D scene understanding.
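The latent-variable formulation can be sketched as follows; the notation here is our own assumption for illustration, not taken from the paper. Let $V$ denote the egocentric video, $E$ the HVR of the environment, $y$ the action label, and $\ell$ the latent 3D action location. Recognition then marginalizes over candidate locations,
$$p(y \mid V, E) \;=\; \sum_{\ell} p(y \mid V, E, \ell)\, p(\ell \mid V, E),$$
so the action posterior is shaped both by the video evidence and by the contextual cues the map provides around each potential location $\ell$.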