While most recent autonomous driving systems focus on developing perception methods based on ego-vehicle sensors, an alternative approach, leveraging intelligent roadside cameras to extend perception beyond the ego vehicle's visual range, tends to be overlooked. We discover that state-of-the-art vision-centric bird's eye view detection methods perform poorly on roadside cameras. This is because these methods mainly focus on recovering the depth with respect to the camera center, where the depth difference between a car and the ground quickly shrinks as the distance increases. In this paper, we propose a simple yet effective approach, dubbed BEVHeight, to address this issue. In essence, instead of predicting the pixel-wise depth, we regress the height to the ground to achieve a distance-agnostic formulation that eases the optimization of camera-only perception methods. On popular 3D detection benchmarks of roadside cameras, our method surpasses all previous vision-centric methods by a significant margin. The code is available at {\url{https://github.com/ADLab-AutoDrive/BEVHeight}}.
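To make the geometric claim above concrete, consider a back-of-the-envelope calculation (the symbols $H$, $h$, and $d$ are our own illustrative notation, not taken from the abstract). For a roadside camera mounted at height $H$ above the ground, take a point on a car at height $h$ and its projection onto the ground, both at horizontal distance $d$ from the camera. Their depths differ by
\[
\Delta z \;=\; \sqrt{d^2 + H^2} \;-\; \sqrt{d^2 + (H-h)^2} \;\approx\; \frac{2Hh - h^2}{2d} \;\xrightarrow{\;d \to \infty\;}\; 0,
\]
so a depth-based regression target degenerates at range, whereas the height $h$ of the same point above the ground is independent of $d$, which is what makes the height formulation distance-agnostic.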