3D visual perception with a surround-view fisheye camera system is a critical and challenging task for low-cost urban autonomous driving. Existing monocular 3D object detection methods do not perform well enough on fisheye images for mass production, partly due to the lack of 3D datasets of such images. In this paper, we avoid the difficulty of acquiring large-scale, accurately 3D-labeled ground-truth data by breaking the 3D object detection task down into sub-tasks such as the vehicle's contact-point detection, type classification, re-identification, and unit assembling. In particular, we propose the concept of the Multidimensional Vector, which gathers the usable information generated in different dimensions and stages, instead of describing an object as a bird's-eye-view (BEV) box or a cube of eight points. Experiments on real fisheye images demonstrate that our solution achieves state-of-the-art accuracy while running in real time in practice.
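For illustration only, the sketch below shows one way the per-vehicle sub-task outputs could be bundled into such a multidimensional vector. The field names (contact_points, vehicle_type, reid_embedding) and the assemble_vector helper are hypothetical placeholders, not the paper's actual data layout.

    from dataclasses import dataclass
    from typing import List, Tuple
    import numpy as np

    # Hypothetical container aggregating the outputs of the sub-tasks
    # (contact-point detection, type classification, re-identification)
    # into one "multidimensional vector" per observed vehicle.
    @dataclass
    class MultidimensionalVector:
        camera_id: int                              # which fisheye camera produced the detection
        contact_points: List[Tuple[float, float]]   # wheel-ground contact points in image coordinates
        vehicle_type: str                           # e.g. "car", "bus", "truck" from the classifier
        reid_embedding: np.ndarray                  # appearance feature for cross-camera re-identification
        timestamp: float                            # frame time, used when assembling units across stages

    def assemble_vector(camera_id: int,
                        contact_points: List[Tuple[float, float]],
                        vehicle_type: str,
                        reid_embedding: np.ndarray,
                        timestamp: float) -> MultidimensionalVector:
        """Bundle the sub-task outputs for one vehicle; a downstream unit-assembling
        stage can then fuse vectors whose re-ID embeddings match across cameras."""
        return MultidimensionalVector(camera_id, contact_points, vehicle_type,
                                      reid_embedding, timestamp)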