Anchor Distance for 3D Multi-Object Distance Estimation from 2D Single Shot

Visual perception of objects in a 3D environment is key to successful performance in autonomous driving and simultaneous localization and mapping (SLAM). In this paper, we present a real-time approach for estimating the distances to multiple objects in a scene using only a single-shot image. Given a 2D bounding box (BBox) and object parameters, the 3D distance to an object can be calculated directly using 3D reprojection; however, such methods are prone to significant errors because an error in the 2D detection can be amplified in 3D. It is also challenging to apply such methods to a real-time system because of the computational burden. Traditional multi-object detection methods have been developed for specific tasks such as object segmentation or 2D BBox regression. These methods introduce the concept of an anchor BBox for elaborate 2D BBox estimation, and their predictors are specialized and trained for specific 2D BBoxes. To estimate the distances to 3D objects from a single 2D image, we introduce the notion of \textit{anchor distance}, based on an object's location, and propose a method that applies the anchor distance to the multi-object detector structure. We let the predictors capture the distance prior through the anchor distance and train the network based on the distance, so that each predictor is specialized for objects located in a specific distance range. By propagating the distance prior to the predictors through the anchor distance, it is feasible to achieve precise distance estimation and real-time execution simultaneously. The proposed method runs at about 30 FPS and shows the lowest RMSE among the existing methods it is compared with.
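
To make the error-amplification argument concrete, the following minimal Python sketch uses the standard pinhole relation Z = f * H / h to show how a one-pixel error in the 2D box height changes the reprojected distance for a near and a far object. The focal length, object height, box heights, and anchor values are illustrative assumptions, not values from the paper. The `nearest_anchor` helper is likewise hypothetical: it is only one plausible reading of "anchor distance" (assign an object to the closest distance anchor and keep a residual), since the abstract does not specify the actual formulation.

```python
def distance_from_bbox_height(focal_px, object_height_m, bbox_height_px):
    """Pinhole-camera distance estimate: Z = f * H / h.

    focal_px        -- focal length in pixels (assumed known from calibration)
    object_height_m -- assumed real-world object height in meters
    bbox_height_px  -- height of the detected 2D box in pixels
    """
    if bbox_height_px <= 0:
        raise ValueError("bbox_height_px must be positive")
    return focal_px * object_height_m / bbox_height_px


def nearest_anchor(distance_m, anchor_distances_m):
    """Hypothetical helper: index of the closest anchor and the residual to it."""
    idx = min(
        range(len(anchor_distances_m)),
        key=lambda i: abs(distance_m - anchor_distances_m[i]),
    )
    return idx, distance_m - anchor_distances_m[idx]


# Illustrative values only (assumptions, not taken from the paper).
FOCAL_PX = 720.0
CAR_HEIGHT_M = 1.5

# Near object: 100 px tall box; a 1 px detection error shifts Z slightly.
near = distance_from_bbox_height(FOCAL_PX, CAR_HEIGHT_M, 100)
near_err = distance_from_bbox_height(FOCAL_PX, CAR_HEIGHT_M, 99) - near

# Far object: 20 px tall box; the same 1 px error shifts Z by meters.
far = distance_from_bbox_height(FOCAL_PX, CAR_HEIGHT_M, 20)
far_err = distance_from_bbox_height(FOCAL_PX, CAR_HEIGHT_M, 19) - far

# Assign the far object to one of three made-up distance anchors.
anchor_idx, residual = nearest_anchor(far, [10.0, 25.0, 50.0])

print(f"near: {near:.2f} m, error from 1 px: {near_err:.3f} m")
print(f"far:  {far:.2f} m, error from 1 px: {far_err:.3f} m")
print(f"far object -> anchor {anchor_idx}, residual {residual:.2f} m")
```

Under these assumed numbers, the same one-pixel error costs roughly a tenth of a meter for the near object and several meters for the far one, which is the kind of sensitivity that motivates specializing predictors to distance ranges rather than relying on direct reprojection alone.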
