This paper presents DeepI2P: a novel approach for cross-modality registration between an image and a point cloud. Given an image (e.g., from an RGB camera) and a general point cloud (e.g., from a 3D Lidar scanner) captured at different locations in the same scene, our method estimates the relative rigid transformation between the coordinate frames of the camera and the Lidar. Learning common feature descriptors to establish correspondences for the registration is inherently challenging due to the lack of appearance and geometric correlations across the two modalities. We circumvent this difficulty by converting the registration problem into a classification and inverse camera projection optimization problem. A classification neural network is designed to label whether the projection of each point in the point cloud falls within or beyond the camera frustum. These labeled points are subsequently passed into a novel inverse camera projection solver to estimate the relative pose. Extensive experimental results on the Oxford RobotCar and KITTI datasets demonstrate the feasibility of our approach. Our source code is available at https://github.com/lijx10/DeepI2P.
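To make the frustum-classification target concrete, the following is a minimal sketch of how such within/beyond-frustum labels could be defined when the camera pose is known (e.g., for training supervision); at test time the pose is unknown and the network predicts these labels. The function name `frustum_labels` and all parameter names are illustrative assumptions, not the paper's actual implementation, and a standard pinhole camera model is assumed.

```python
import numpy as np

def frustum_labels(points, K, R, t, width, height):
    """Label each 3D point as inside (1) or outside (0) the camera frustum.

    Hypothetical helper for illustration only.
    points: (N, 3) array in the Lidar frame.
    K: (3, 3) camera intrinsic matrix.
    R, t: rotation (3, 3) and translation (3,) mapping Lidar points into the camera frame.
    width, height: image size in pixels.
    """
    # Transform points into the camera coordinate frame.
    p_cam = points @ R.T + t                          # (N, 3)
    # Points behind the camera cannot lie inside the frustum.
    in_front = p_cam[:, 2] > 0
    # Pinhole projection onto the image plane.
    p_img = p_cam @ K.T                               # (N, 3)
    z = np.clip(p_img[:, 2], 1e-8, None)
    u = p_img[:, 0] / z
    v = p_img[:, 1] / z
    # A point is within the frustum if its projection lands inside the image bounds.
    in_bounds = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return (in_front & in_bounds).astype(np.int64)
```

The key idea is that these binary labels, predicted for an unknown pose, constrain which points must project inside the image, which the inverse camera projection solver then exploits to recover the relative pose.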