Bird's Eye View (BEV) representations are tremendously useful for perception-related automated driving tasks. However, generating BEVs from surround-view fisheye camera images is challenging due to the strong distortions introduced by such wide-angle lenses. We take the first step in addressing this challenge and introduce a baseline, F2BEV, to generate BEV height maps and semantic segmentation maps from fisheye images. F2BEV consists of a distortion-aware spatial cross attention module for querying and consolidating spatial information from fisheye image features in a transformer-style architecture followed by a task-specific head. We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset, all of which generate better BEV height and segmentation maps (in terms of the IoU) than a state-of-the-art BEV generation method operating on undistorted fisheye images. We also demonstrate height map generation from real-world fisheye images using F2BEV. An initial sample of our dataset is publicly available at https://tinyurl.com/58jvnscy
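As a point of reference for the IoU comparison mentioned above, the following is a minimal sketch of per-class intersection-over-union between a predicted and a ground-truth label map. It is not the authors' evaluation code: the function name `per_class_iou`, the three-class layout, and the toy 2x3 maps are assumptions made purely for illustration.

```python
def per_class_iou(pred, target, num_classes):
    """Return a list with the IoU of each class for two 2D label maps.

    `pred` and `target` are nested lists of integer class labels with the
    same shape. A class that appears in neither map gets None.
    """
    flat_pred = [label for row in pred for label in row]
    flat_target = [label for row in target for label in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("pred and target must contain the same number of cells")

    ious = []
    for c in range(num_classes):
        # Cells labelled c in both maps.
        inter = sum(1 for p, t in zip(flat_pred, flat_target) if p == c and t == c)
        # Cells labelled c in at least one map.
        union = sum(1 for p, t in zip(flat_pred, flat_target) if p == c or t == c)
        ious.append(inter / union if union else None)
    return ious


# Toy BEV label maps (hypothetical classes 0, 1, 2).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]

ious = per_class_iou(pred, target, num_classes=3)
valid = [v for v in ious if v is not None]
mean_iou = sum(valid) / len(valid)
print(ious, mean_iou)
```

The same routine could in principle score a height map once heights are binned into discrete classes; how the bins are chosen is left open here.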