建造用于Egocentcent 3D粒子估计的时空变异器 (Building Spatio-temporal Transformers for Egocentric 3D Pose Estimation) - 专知论文

会员服务 ·

0

估计/估计量 · 变换 · 3D · INFORMS · Performer ·

2022 年 6 月 9 日

Building Spatio-temporal Transformers for Egocentric 3D Pose Estimation

翻译：建造用于Egocentcent 3D粒子估计的时空变异器

Jinman Park,Kimathi Kaai,Saad Hossain,Norikatsu Sumi,Sirisha Rambhatla,Paul Fieguth

from arxiv, 4 pages, Extended abstract, Joint International Workshop on Egocentric Perception, Interaction and Computing (EPIC) and Ego4D, IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR), 2022

Egocentric 3D human pose estimation (HPE) from images is challenging due to severe self-occlusions and strong distortion introduced by the fish-eye view from the head mounted camera. Although existing works use intermediate heatmap-based representations to counter distortion with some success, addressing self-occlusion remains an open problem. In this work, we leverage information from past frames to guide our self-attention-based 3D HPE estimation procedure -- Ego-STAN. Specifically, we build a spatio-temporal Transformer model that attends to semantically rich convolutional neural network-based feature maps. We also propose feature map tokens: a new set of learnable parameters to attend to these feature maps. Finally, we demonstrate Ego-STAN's superior performance on the xR-EgoPose dataset where it achieves a 30.6% improvement on the overall mean per-joint position error, while leading to a 22% drop in parameters compared to the state-of-the-art.

翻译：图像中的以地球为中心的 3D 人形估计(HPE) 具有挑战性,因为头架相机的鱼眼视图引入了严重的自我隔离和强烈扭曲。虽然现有作品使用中间热映射表示法成功地抵消扭曲,但解决自我封闭仍然是一个尚未解决的问题。在这项工作中,我们利用过去框架的信息来指导我们基于自我注意的3D HPE 估计程序 -- -- Ego-STAN。具体地说,我们建造了一个片段-时空变异器模型,该模型将关注于以语义丰富的神经网络为基础的地貌图。我们还提出了地貌图符号:一套新的可学习参数,用于观看这些地貌图。最后,我们展示了Ego-STAN在 xR-EgoPose数据集上的优异性表现,它使整体平均连接位置错误提高了30.6%。同时导致参数与最新技术相比下降了22%。

0

相关内容

估计/估计量

估计/估计量

【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

专知会员服务

14+阅读 · 2022年3月19日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

TRIM39调节PUMA蛋白稳定性的机制及其在肿瘤发生中的生物学功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

内源性大麻素介导自主运动增强学习记忆及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

片仔癀干预骨肉瘤干细胞ABC转运蛋白及PI3K/AKt信号通路逆转耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR介导的脂质合成和脂质自噬在炎症促进肝脏脂质沉积中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Linked Open Data的Web服务语义互操作关键技术

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR信号通路介导SIRT3调控糖尿病肾病系膜细胞肥大的作用及分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

微生物降解多环芳烃的代谢物分析及其共代谢机理

国家自然科学基金

0+阅读 · 2009年12月31日

Monocular 3D Object Detection with Depth from Motion

Monocular 3D Object Detection with Depth from Motion

Arxiv

0+阅读 · 2022年7月26日

Graph Neural Network and Spatiotemporal Transformer Attention for 3D Video Object Detection from Point Clouds

Arxiv

0+阅读 · 2022年7月26日

A Visual Navigation Perspective for Category-Level Object Pose Estimation

Arxiv

0+阅读 · 2022年7月23日

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

Arxiv

0+阅读 · 2022年7月22日

Pyramid Transformer for Traffic Sign Detection

Arxiv

0+阅读 · 2022年7月22日

Focused Decoding Enables 3D Anatomical Detection by Transformers

Arxiv

0+阅读 · 2022年7月21日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

Arxiv

10+阅读 · 2020年3月20日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

【CVPR 2022】基于时空解耦与重耦的RGB-D动作识别 Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

专知会员服务

14+阅读 · 2022年3月19日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk1

【ICIG2021】Latest News & Announcements of the Plenary Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年11月1日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Monocular 3D Object Detection with Depth from Motion

Monocular 3D Object Detection with Depth from Motion

Arxiv

0+阅读 · 2022年7月26日

Graph Neural Network and Spatiotemporal Transformer Attention for 3D Video Object Detection from Point Clouds

Arxiv

0+阅读 · 2022年7月26日

A Visual Navigation Perspective for Category-Level Object Pose Estimation

Arxiv

0+阅读 · 2022年7月23日

Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

Arxiv

0+阅读 · 2022年7月22日

Pyramid Transformer for Traffic Sign Detection

Arxiv

0+阅读 · 2022年7月22日

Focused Decoding Enables 3D Anatomical Detection by Transformers

Arxiv

0+阅读 · 2022年7月21日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

Arxiv

13+阅读 · 2020年12月3日

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

Arxiv

10+阅读 · 2020年3月20日

相关基金

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

TRIM39调节PUMA蛋白稳定性的机制及其在肿瘤发生中的生物学功能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

内源性大麻素介导自主运动增强学习记忆及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

片仔癀干预骨肉瘤干细胞ABC转运蛋白及PI3K/AKt信号通路逆转耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR介导的脂质合成和脂质自噬在炎症促进肝脏脂质沉积中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Linked Open Data的Web服务语义互操作关键技术

国家自然科学基金

0+阅读 · 2012年12月31日

mTOR信号通路介导SIRT3调控糖尿病肾病系膜细胞肥大的作用及分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

微生物降解多环芳烃的代谢物分析及其共代谢机理

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员