SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for Day-Night Place Recognition

Place recognition is a crucial capability for mobile robot localization and navigation. Image-based or Visual Place Recognition (VPR) is challenging because scene appearance and camera viewpoint can change significantly when places are revisited. Recent VPR methods based on "sequential representations" have shown promising results compared to traditional sequence score aggregation or single-image techniques. In parallel, 3D point cloud based place recognition is also being explored, following advances in deep learning based point cloud processing. However, a key question remains: is an explicit 3D structure based place representation always superior to an implicit "spatial" representation based on a sequence of RGB images, which can inherently learn scene structure? In this extended abstract, we attempt to compare these two types of methods by considering a similar "metric span" to represent places. We compare a 3D point cloud based method (PointNetVLAD) with image sequence based methods (SeqNet and others) and show that image sequence based techniques approach, and can even surpass, the performance achieved by point cloud based methods for a given metric span. These performance variations can be attributed to differences in the data richness of the input sensors as well as data accumulation strategies for a mobile robot. While a perfect apples-to-apples comparison may not be feasible for these two different modalities, the presented comparison takes a step towards answering deeper questions regarding spatial representations, relevant to several applications such as autonomous driving and augmented/virtual reality. Source code is publicly available at https://github.com/oravus/seqNet.
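The abstract does not spell out how a "metric span" is computed. As a minimal sketch, assuming per-frame 2D positions are available (for example from odometry), one way to pick the image frames that represent a place is to walk backwards from the current frame until the travelled path length would exceed the span. The function name `frames_within_span` and its parameters are hypothetical and are not taken from the SeqNet code base.

```python
import math


def frames_within_span(positions, end_idx, span_m):
    """Return indices of the contiguous frames ending at end_idx whose
    travelled path length, measured backwards, stays within span_m metres.

    positions: list of (x, y) tuples, one per frame (assumed input format).
    """
    indices = [end_idx]
    travelled = 0.0
    # Walk backwards one frame at a time, accumulating path length.
    for i in range(end_idx, 0, -1):
        step = math.dist(positions[i], positions[i - 1])
        if travelled + step > span_m:
            break
        travelled += step
        indices.append(i - 1)
    # Return indices in chronological order.
    return indices[::-1]


if __name__ == "__main__":
    # Frames spaced 1 m apart along a straight line (illustrative data only).
    poses = [(float(k), 0.0) for k in range(5)]
    print(frames_within_span(poses, end_idx=4, span_m=2.0))
```

Under this assumption, a point cloud submap and an image sequence can be compared over the same travelled distance rather than over the same number of frames, which is what makes the two modalities roughly comparable.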


