用于单眼深度预测的有核感多物体自监督的多物体自我监督 (Instance-aware multi-object self-supervision for monocular depth prediction) - 专知论文

会员服务 ·

0

Performer · SOTA · INTERACT · Attention · 估计/估计量 ·

2022 年 8 月 9 日

Instance-aware multi-object self-supervision for monocular depth prediction

翻译：用于单眼深度预测的有核感多物体自监督的多物体自我监督

Houssem Boulahbal,Adrian Voicila,Andrew Comport

from arxiv, IROS 2022 and RAL

This paper proposes a self-supervised monocular image-to-depth prediction framework that is trained with an end-to-end photometric loss that handles not only 6-DOF camera motion but also 6-DOF moving object instances. Self-supervision is performed by warping the images across a video sequence using depth and scene motion including object instances. One novelty of the proposed method is the use of the multi-head attention of the transformer network that matches moving objects across time and models their interaction and dynamics. This enables accurate and robust pose estimation for each object instance. Most image-to-depth predication frameworks make the assumption of rigid scenes, which largely degrades their performance with respect to dynamic objects. Only a few SOTA papers have accounted for dynamic objects. The proposed method is shown to outperform these methods on standard benchmarks and the impact of the dynamic motion on these benchmarks is exposed. Furthermore, the proposed image-to-depth prediction framework is also shown to be competitive with SOTA video-to-depth prediction frameworks.

翻译：本文提出一个自我监督的单视图像到深度预测框架,经过培训,对端到端的光度损失进行训练,不仅处理6-DOF摄像运动,而且处理6-DOF移动物体情况。自我监督是通过使用深度和场景运动(包括物体实例)对图像进行扭曲的视频序列进行;拟议方法的一个新颖之处是使用变压器网络的多头目关注,该变压器网络将不同时间移动的物体相匹配,并模拟其相互作用和动态。这样可以对每个对象实例作出准确和稳健的估计。大多数图像到深度的预测框架假设了僵硬的场景,这些场景在很大程度上降低了它们相对于动态物体的性能。只有少数SOTA文件对动态物体进行了说明。拟议的方法显示这些方法超过了标准基准,动态运动对这些基准的影响也暴露了出来。此外,拟议的图像到深度预测框架也显示与SOTA视频到深度预测框架具有竞争力。

0

相关内容

Performer

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

石墨烯量子点/WO3超薄纳米片异质结的制备及光催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

补体C3a在心肌梗死后心脏重塑中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Cys C/Wnt通路在高龄急性心肌梗死心肌损伤中的作用及新机制

国家自然科学基金

0+阅读 · 2015年12月31日

野生型核孔蛋白98(WTNup98)在Nup98融合基因阳性髓系肿瘤发生中的作用及机制探讨

国家自然科学基金

0+阅读 · 2014年12月31日

骨髓间充质干细胞旁分泌CTRP3水平影响心肌梗死疗效及机制

国家自然科学基金

0+阅读 · 2014年12月31日

Trx对鸡心肌细胞能量代谢的影响

国家自然科学基金

0+阅读 · 2013年12月31日

TREM-1/DAP12/ NF-κB信号通路在6-姜烯酚抗动脉粥样硬化中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

GSK-3β调控血管平滑肌细胞特异性转录因子Myocardin对动脉粥样硬化斑块形成作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ-miRNA-711信号通路在心肌梗死后心脏重塑中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intermedin-53在心肌肥厚中的作用和机制

国家自然科学基金

0+阅读 · 2011年12月31日

Temporally Consistent Video Transformer for Long-Term Video Prediction

Arxiv

0+阅读 · 2022年10月5日

MOTSLAM: MOT-assisted monocular dynamic SLAM using single-view depth estimation

Arxiv

0+阅读 · 2022年10月5日

Cross-Modality Fusion Transformer for Multispectral Object Detection

Arxiv

0+阅读 · 2022年10月4日

Self-Supervised Monocular Depth Estimation: Solving the Edge-Fattening Problem

Arxiv

0+阅读 · 2022年10月4日

Transformers for Object Detection in Large Point Clouds

Arxiv

0+阅读 · 2022年9月30日

Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition

Arxiv

0+阅读 · 2022年9月30日

Learning Depth from Focus in the Wild

Arxiv

0+阅读 · 2022年9月30日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks

Arxiv

13+阅读 · 2018年1月6日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

抢鲜看！13篇CVPR2020论文链接/开源代码/解读

专知会员服务

50+阅读 · 2020年2月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

从社会学实验到行为仿真：理解基于Agent的观点动力学建模思维

中英文版《GPT-5 System Card速览》报告

ACL 2025 | 大模型结构化知识提示的泛化能力研究

【普林斯顿博士论文】大型模型的高效推理

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

相关论文

Temporally Consistent Video Transformer for Long-Term Video Prediction

Arxiv

0+阅读 · 2022年10月5日

MOTSLAM: MOT-assisted monocular dynamic SLAM using single-view depth estimation

Arxiv

0+阅读 · 2022年10月5日

Cross-Modality Fusion Transformer for Multispectral Object Detection

Arxiv

0+阅读 · 2022年10月4日

Self-Supervised Monocular Depth Estimation: Solving the Edge-Fattening Problem

Arxiv

0+阅读 · 2022年10月4日

Transformers for Object Detection in Large Point Clouds

Arxiv

0+阅读 · 2022年9月30日

Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition

Arxiv

0+阅读 · 2022年9月30日

Learning Depth from Focus in the Wild

Arxiv

0+阅读 · 2022年9月30日

Temporal Relational Modeling with Self-Supervision for Action Segmentation

Arxiv

13+阅读 · 2020年12月14日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

Learning Hierarchical Features for Visual Object Tracking with Recursive Neural Networks

Arxiv

13+阅读 · 2018年1月6日

相关基金

石墨烯量子点/WO3超薄纳米片异质结的制备及光催化性能研究

国家自然科学基金

0+阅读 · 2015年12月31日

补体C3a在心肌梗死后心脏重塑中的作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Cys C/Wnt通路在高龄急性心肌梗死心肌损伤中的作用及新机制

国家自然科学基金

0+阅读 · 2015年12月31日

野生型核孔蛋白98(WTNup98)在Nup98融合基因阳性髓系肿瘤发生中的作用及机制探讨

国家自然科学基金

0+阅读 · 2014年12月31日

骨髓间充质干细胞旁分泌CTRP3水平影响心肌梗死疗效及机制

国家自然科学基金

0+阅读 · 2014年12月31日

Trx对鸡心肌细胞能量代谢的影响

国家自然科学基金

0+阅读 · 2013年12月31日

TREM-1/DAP12/ NF-κB信号通路在6-姜烯酚抗动脉粥样硬化中的作用研究

国家自然科学基金

0+阅读 · 2012年12月31日

GSK-3β调控血管平滑肌细胞特异性转录因子Myocardin对动脉粥样硬化斑块形成作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

PPARγ-miRNA-711信号通路在心肌梗死后心脏重塑中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Intermedin-53在心肌肥厚中的作用和机制

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员