Monocular 3D object detection is an important task for autonomous driving considering its advantage of low cost. It is much more challenging than the conventional 2D case due to its inherently ill-posed nature, which is mainly reflected in the lack of depth information. Recent progress on 2D detection offers opportunities to better solve this problem. However, it is non-trivial to make a general 2D detector work in this 3D task. In this technical report, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework, FCOS3D. Specifically, we first transform the commonly defined 7-DoF 3D targets to the image domain and decouple them into 2D and 3D attributes. The objects are then distributed to different feature levels according to their 2D scales and assigned only by the projected 3D centers during training. Furthermore, the center-ness is redefined with a 2D Gaussian distribution based on the 3D center to fit the 3D target formulation. All of this makes the framework simple yet effective, getting rid of any 2D detection or 2D-3D correspondence priors. Our solution achieves 1st place among all vision-only methods in the nuScenes 3D detection challenge of NeurIPS 2020. Code and models are released at https://github.com/open-mmlab/mmdetection3d.
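The two key operations mentioned above, projecting the 3D object center into the image domain and redefining center-ness with a 2D Gaussian around that projected center, can be sketched as follows. This is a minimal illustration, not the report's implementation: the function names are hypothetical, a standard pinhole camera model is assumed, and the decay hyperparameter `alpha` and the stride-based distance normalization are assumptions not stated in the abstract.

```python
import math

def project_center(x, y, z, fx, fy, cx, cy):
    """Pinhole projection of a 3D object center (camera coordinates)
    onto the image plane. (fx, fy) are focal lengths, (cx, cy) the
    principal point; returns pixel coordinates (u, v)."""
    return fx * x / z + cx, fy * y / z + cy

def gaussian_centerness(center_u, center_v, pu, pv, stride=8.0, alpha=2.5):
    """2D Gaussian center-ness target for a feature-map location.

    (center_u, center_v): projected 3D center in pixels.
    (pu, pv): image-plane position of the feature-map point.
    Distances are normalized by the feature-level stride (assumption),
    and alpha controls how fast the target decays (assumed value).
    Returns 1.0 at the projected center, decaying toward 0 with distance."""
    du = (pu - center_u) / stride
    dv = (pv - center_v) / stride
    return math.exp(-alpha * (du * du + dv * dv))
```

For example, a point 10 m straight ahead of the camera projects to the principal point, where the center-ness target is exactly 1; locations farther from the projected center receive smaller targets, down-weighting low-quality predictions during training.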