Estimating the distance to objects is crucial for autonomous vehicles. When depth sensors cannot be used, this distance has to be estimated from RGB cameras. Unlike for cars, the task of estimating depth from on-board cameras is made complex on drones by the lack of constraints on motion during flight. In this paper, we present a method to estimate the distance of objects seen by an on-board camera by using its RGB video stream and drone motion information. Our method is built upon a pyramidal convolutional neural network architecture and uses time recurrence together with the geometric constraints imposed by motion to produce pixel-wise depth maps. In our architecture, each level of the pyramid is designed to produce its own depth estimate based on past observations and on information provided by the previous level of the pyramid. We introduce a spatial reprojection layer to maintain the spatio-temporal consistency of the data between the levels. We analyse the performance of our approach on Mid-Air, a public drone dataset featuring synthetic drone trajectories recorded in a wide variety of unstructured outdoor environments. Our experiments show that our network outperforms state-of-the-art depth estimation methods and that the use of motion information is the main contributing factor for this improvement. The code of our method is publicly available on GitHub; see https://github.com/michael-fonder/M4Depth
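To make the role of the spatial reprojection layer more concrete, below is a minimal sketch of motion-based feature warping in PyTorch. This is not the authors' implementation (that lives in the linked repository); the function name `reproject_features`, the tensor conventions, and the PyTorch framing are all illustrative assumptions. The underlying idea is standard: back-project the pixels of the current frame using a depth estimate, displace them by the known camera motion, and bilinearly sample the previous frame's feature map at the reprojected locations.

```python
# Hypothetical sketch of a spatial reprojection (feature warping) layer.
# Not the M4Depth implementation; conventions are assumptions for illustration.
import torch
import torch.nn.functional as F

def reproject_features(feat_prev, depth, K, R, t):
    """Warp features from frame t-1 into the current view.

    feat_prev: [B, C, H, W] feature map from the previous frame
    depth:     [B, 1, H, W] depth estimate for the current frame
    K:         [3, 3]       camera intrinsics
    R, t:      [3, 3] / [3] relative motion from current to previous frame
    """
    B, C, H, W = feat_prev.shape
    device = feat_prev.device

    # Build the pixel grid of the current frame in homogeneous coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)

    # Back-project each pixel to 3D using the current depth estimate.
    rays = torch.inverse(K) @ pix                       # [3, H*W]
    cam = rays.unsqueeze(0) * depth.reshape(B, 1, -1)   # [B, 3, H*W]

    # Apply the known camera motion, then project into the previous frame.
    cam_prev = R @ cam + t.reshape(1, 3, 1)
    proj = K @ cam_prev
    uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)     # pixel coords in t-1

    # Normalise coordinates to [-1, 1] as required by grid_sample.
    u = 2.0 * uv[:, 0] / (W - 1) - 1.0
    v = 2.0 * uv[:, 1] / (H - 1) - 1.0
    grid = torch.stack([u, v], dim=-1).reshape(B, H, W, 2)

    # Bilinearly sample the previous features at the reprojected locations.
    return F.grid_sample(feat_prev, grid, align_corners=True)
```

In a pyramidal, recurrent setting such as the one described in the abstract, a warp of this kind would be applied at each pyramid level so that the features compared across time stay spatially aligned despite the drone's unconstrained motion.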