3D 地图上的以种族为中心的活动识别和定位 (Egocentric Activity Recognition and Localization on a 3D Map) - 专知论文

会员服务 ·

0

3D · 回合 · Extensibility · 潜变量/隐变量 · MoDELS ·

2021 年 5 月 27 日

Egocentric Activity Recognition and Localization on a 3D Map

翻译：3D 地图上的以种族为中心的活动识别和定位

Miao Liu,Lingni Ma,Kiran Somasundaram,Yin Li,Kristen Grauman,James M. Rehg,Chao Li

Given a video captured from a first person perspective and recorded in a familiar environment, can we recognize what the person is doing and identify where the action occurs in the 3D space? We address this challenging problem of jointly recognizing and localizing actions of a mobile user on a known 3D map from egocentric videos. To this end, we propose a novel deep probabilistic model. Our model takes the inputs of a Hierarchical Volumetric Representation (HVR) of the environment and an egocentric video, infers the 3D action location as a latent variable, and recognizes the action based on the video and contextual cues surrounding its potential locations. To evaluate our model, we conduct extensive experiments on a newly collected egocentric video dataset, in which both human naturalistic actions and photo-realistic 3D environment reconstructions are captured. Our method demonstrates strong results on both action recognition and 3D action localization across seen and unseen environments. We believe our work points to an exciting research direction in the intersection of egocentric vision, and 3D scene understanding.

翻译：根据从第一人的角度拍摄并在熟悉的环境中录制的视频,我们能否认识一个人在做什么,并确定3D空间的行动在哪里发生?我们从自我中心视频的已知的 3D 地图上共同识别移动用户的行动并将其定位这一具有挑战性的问题?为此,我们提出一个新的深层次概率模型。我们的模型采用环境的高度量子代表(HVR)和以自我为中心的视频的投入,将3D行动位置推断为潜伏变量,并承认基于其潜在位置的视频和背景线索的行动。为了评估我们的模型,我们对新收集的以自我为中心的视频数据集进行了广泛的实验,其中记录了人类自然行动和摄影现实的3D环境重建。我们的方法展示了在可见和看不见环境中的行动识别和3D行动定位的强有力结果。我们相信我们的工作点有助于在自我中心愿景和3D场理解的交叉点上找到令人兴奋的研究方向。

0

相关内容

3D是英文“Three Dimensions”的简称，中文是指三维、三个维度、三个坐标，即有长、有宽、有高，换句话说，就是立体的，是相对于只有长和宽的平面（2D）而言。

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

专知会员服务

25+阅读 · 2020年7月1日

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

专知会员服务

14+阅读 · 2020年6月18日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

【论文推荐】几何图形卷积网络，GEOM-GCN: GEOMETRIC GRAPH CONVOLUTIONAL NETWORKS

【论文推荐】几何图形卷积网络，GEOM-GCN: GEOMETRIC GRAPH CONVOLUTIONAL NETWORKS

专知会员服务

77+阅读 · 2020年2月5日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

八篇 ICCV 2019 【图神经网络（GNN）+CV】相关论文

八篇 ICCV 2019 【图神经网络（GNN）+CV】相关论文

专知会员服务

30+阅读 · 2020年1月10日

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

专知会员服务

10+阅读 · 2019年11月28日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

跟踪SLAM前沿动态系列之ICCV2019

跟踪SLAM前沿动态系列之ICCV2019

泡泡机器人SLAM

7+阅读 · 2019年11月23日

计算机 | 国际会议信息5条

计算机 | 国际会议信息5条

Call4Papers

3+阅读 · 2019年7月3日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

计算机类 | ISCC 2019等国际会议信息9条

计算机类 | ISCC 2019等国际会议信息9条

Call4Papers

5+阅读 · 2018年12月25日

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

泡泡机器人SLAM

17+阅读 · 2018年10月12日

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

专知

15+阅读 · 2018年2月13日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

CORAL: Colored structural representation for bi-modal place recognition

Arxiv

0+阅读 · 2021年7月19日

Dynamic Convolution for 3D Point Cloud Instance Segmentation

Arxiv

0+阅读 · 2021年7月18日

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Arxiv

4+阅读 · 2021年1月17日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Discovery and recognition of motion primitives in human activities

Discovery and recognition of motion primitives in human activities

Arxiv

4+阅读 · 2019年2月4日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

Stereo Magnification: Learning View Synthesis using Multiplane Images

Arxiv

5+阅读 · 2018年5月24日

Fine-grained Activity Recognition in Baseball Videos

Arxiv

6+阅读 · 2018年4月9日

Long-term Visual Localization using Semantically Segmented Images

Arxiv

7+阅读 · 2018年1月16日

VIP会员

文章信息

相关主题

潜变量/隐变量

相关VIP内容

“CVPR 2021 接受论文列表 1663篇论文都在这了

专知会员服务

32+阅读 · 2021年6月12日

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

【ICML2020-伯克利-马毅老师组】深度等距学习的视觉识别，Deep Isometric Learning for Visual Recognition

专知会员服务

25+阅读 · 2020年7月1日

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

【CVPR2020】视觉导航的神经拓扑SLAM，56页ppt，Neural Topological SLAM for Visual Navigation

专知会员服务

14+阅读 · 2020年6月18日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

【论文推荐】几何图形卷积网络，GEOM-GCN: GEOMETRIC GRAPH CONVOLUTIONAL NETWORKS

【论文推荐】几何图形卷积网络，GEOM-GCN: GEOMETRIC GRAPH CONVOLUTIONAL NETWORKS

专知会员服务

77+阅读 · 2020年2月5日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

八篇 ICCV 2019 【图神经网络（GNN）+CV】相关论文

八篇 ICCV 2019 【图神经网络（GNN）+CV】相关论文

专知会员服务

30+阅读 · 2020年1月10日

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

【CVPR 2019 | tutorial】野外家庭的视觉识别： Visual Recognition of Families In the Wild

专知会员服务

10+阅读 · 2019年11月28日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

隐身自主无人水下航行器技术如何变革水下作战并重塑海军竞争

《俄乌战争中的无人系统：新的战争方式与新兴趋势——来自前线的印象》报告

《海上自主水面船舶远程操作中心：安全可持续运行的多维度分析》

相关资讯

跟踪SLAM前沿动态系列之ICCV2019

跟踪SLAM前沿动态系列之ICCV2019

泡泡机器人SLAM

7+阅读 · 2019年11月23日

计算机 | 国际会议信息5条

计算机 | 国际会议信息5条

Call4Papers

3+阅读 · 2019年7月3日

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

计算机 | 入门级EI会议ICVRIS 2019诚邀稿件

Call4Papers

10+阅读 · 2019年6月24日

计算机 | IUI 2020等国际会议信息4条

计算机 | IUI 2020等国际会议信息4条

Call4Papers

6+阅读 · 2019年6月17日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

计算机类 | ISCC 2019等国际会议信息9条

计算机类 | ISCC 2019等国际会议信息9条

Call4Papers

5+阅读 · 2018年12月25日

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

【泡泡一分钟】SSD6D：基于RGB的三维检测和6自由度位姿估计(ICCV2017-159)

泡泡机器人SLAM

17+阅读 · 2018年10月12日

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

【论文推荐】最新7篇条件随机场（CRF）相关论文—图像标注、对抗学习、端到端、注意力机制、三维人体姿态、图像分割、行为分割和识别

专知

15+阅读 · 2018年2月13日

计算机类 | 国际会议信息7条

计算机类 | 国际会议信息7条

Call4Papers

3+阅读 · 2017年11月17日

相关论文

CORAL: Colored structural representation for bi-modal place recognition

Arxiv

0+阅读 · 2021年7月19日

Dynamic Convolution for 3D Point Cloud Instance Segmentation

Arxiv

0+阅读 · 2021年7月18日

MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

Arxiv

4+阅读 · 2021年1月17日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Discovery and recognition of motion primitives in human activities

Discovery and recognition of motion primitives in human activities

Arxiv

4+阅读 · 2019年2月4日

Joint Monocular 3D Vehicle Detection and Tracking

Joint Monocular 3D Vehicle Detection and Tracking

Arxiv

8+阅读 · 2018年12月2日

Stereo Magnification: Learning View Synthesis using Multiplane Images

Arxiv

5+阅读 · 2018年5月24日

Fine-grained Activity Recognition in Baseball Videos

Arxiv

6+阅读 · 2018年4月9日

Long-term Visual Localization using Semantically Segmented Images

Arxiv

7+阅读 · 2018年1月16日

微信扫码咨询专知VIP会员