Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, yet its accuracy is still far from satisfactory. In this paper, we dig into the 3D object detection task and reformulate it as the sub-tasks of object localization and appearance perception, which benefits a deep excavation of the reciprocal information underlying the entire task. We introduce a Dynamic Feature Reflecting Network, named DFR-Net, which contains two novel standalone modules: (i) the Appearance-Localization Feature Reflecting module (ALFR), which first separates task-specific features and then mutually reflects the reciprocal features between the two sub-tasks; (ii) the Dynamic Intra-Trading module (DIT), which adaptively realigns the training processes of the sub-tasks in a self-learning manner. Extensive experiments on the challenging KITTI dataset demonstrate the effectiveness and generalization of DFR-Net. We rank 1st among all monocular 3D object detectors on the KITTI test set (as of March 16th, 2021). The proposed method can also be plugged into many cutting-edge 3D detection frameworks at negligible cost to boost performance. The code will be made publicly available.
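To make the two module descriptions concrete, below is a minimal PyTorch sketch of the underlying ideas: task-specific feature separation followed by mutual feature exchange (the ALFR idea), and learned per-task loss re-weighting (the DIT idea). The class names, the gating-based exchange, and the uncertainty-style weighting are illustrative assumptions; they are not the paper's actual ALFR/DIT implementations.

```python
import torch
import torch.nn as nn


class ALFRSketch(nn.Module):
    """Sketch of appearance-localization feature reflection.
    The gating mechanism here is an assumption, not the paper's design."""

    def __init__(self, channels: int):
        super().__init__()
        # Separate task-specific features from the shared backbone feature.
        self.to_loc = nn.Conv2d(channels, channels, 3, padding=1)
        self.to_app = nn.Conv2d(channels, channels, 3, padding=1)
        # Gates that let each branch "reflect" reciprocal information
        # from the other branch back into its own features.
        self.loc_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.app_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, shared: torch.Tensor):
        loc = self.to_loc(shared)  # localization-specific features
        app = self.to_app(shared)  # appearance-specific features
        # Mutual reflection: each branch is modulated by a gate computed
        # from the other branch, exchanging reciprocal cues.
        loc_out = loc + loc * self.app_gate(app)
        app_out = app + app * self.loc_gate(loc)
        return loc_out, app_out


class DITSketch(nn.Module):
    """Learned per-task loss weighting, one common self-learning scheme
    for realigning sub-task training; an assumption, not the paper's DIT."""

    def __init__(self, num_tasks: int):
        super().__init__()
        # One learnable log-variance per sub-task loss.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        # Down-weight noisy tasks (large log_var) while regularizing the weights.
        total = 0.0
        for i, loss in enumerate(losses):
            total = total + torch.exp(-self.log_vars[i]) * loss + self.log_vars[i]
        return total
```

In this reading, the two sub-task branches trade information at the feature level while the weighting module rebalances their losses during training, which matches the abstract's claim that the modules are standalone and can be dropped into other detection frameworks.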