Perception based on Bird's-Eye View (BEV) representation has recently drawn increasing attention, and BEV representation is a promising foundation for next-generation Autonomous Vehicle (AV) perception. However, most existing BEV solutions either require considerable resources for on-vehicle inference or deliver only modest performance. This paper proposes a simple yet effective framework, termed Fast-BEV, which is capable of performing faster BEV perception on on-vehicle chips. Towards this goal, we first find empirically that the BEV representation can be sufficiently powerful without an expensive transformer-based transformation or a depth representation. Fast-BEV consists of five parts. We propose (1) a lightweight, deployment-friendly view transformation that quickly transfers 2D image features to 3D voxel space, (2) a multi-scale image encoder that leverages multi-scale information for better performance, and (3) an efficient BEV encoder designed specifically to speed up on-vehicle inference. We further introduce (4) a strong data augmentation strategy for both image and BEV space to avoid over-fitting, and (5) a multi-frame feature fusion mechanism to leverage temporal information. In experiments on the 2080Ti platform, our R50 model runs at 52.6 FPS with 47.3% NDS on the nuScenes validation set, faster than the BEVDepth-R50 model (41.3 FPS, 47.5% NDS) and the BEVDet4D-R50 model (30.2 FPS, 45.7% NDS). Our largest model (R101@900x1600) achieves a competitive 53.5% NDS on the nuScenes validation set. We further develop a benchmark with considerable accuracy and efficiency on currently popular on-vehicle chips. The code is released at: https://github.com/Sense-GVT/Fast-BEV.
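The abstract does not spell out how the view transformation in part (1) works. As a rough illustration of what a depth-free, deployment-friendly 2D-to-3D transfer could look like, the sketch below assumes that each voxel simply copies the image feature at the pixel its center projects to, and that this voxel-to-pixel mapping is precomputed once from the camera calibration as a lookup table. The function names (`voxel_centers`, `build_lookup`, `fill_voxels`), the array shapes, and the single-camera setup are all hypothetical choices for this example and are not taken from the released Fast-BEV code.

```python
import numpy as np


def voxel_centers(grid_shape, voxel_size, origin):
    """Return an (N, 3) array of voxel-center coordinates (hypothetical helper)."""
    axes = [np.arange(n) for n in grid_shape]
    idx = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
    return (idx + 0.5) * np.asarray(voxel_size) + np.asarray(origin)


def build_lookup(centers, proj, feat_hw):
    """Precompute, per voxel, the flat pixel index it projects to (-1 if not visible).

    proj is an assumed (3, 4) matrix mapping homogeneous 3D points to
    feature-map pixel coordinates (intrinsics times extrinsics).
    """
    h, w = feat_hw
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    uvd = homo @ proj.T                       # (N, 3): u*d, v*d, d
    depth = uvd[:, 2]
    valid = depth > 1e-6                      # keep points in front of the camera
    safe = np.where(valid, depth, 1.0)        # avoid dividing by non-positive depth
    u = np.floor(uvd[:, 0] / safe).astype(np.int64)
    v = np.floor(uvd[:, 1] / safe).astype(np.int64)
    valid &= (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return np.where(valid, v * w + u, -1)


def fill_voxels(feat, lut, grid_shape):
    """Copy image features of shape (C, H, W) into a (C, X, Y, Z) voxel grid via the table."""
    c = feat.shape[0]
    flat = feat.reshape(c, -1)
    vox = np.zeros((c, lut.size), dtype=feat.dtype)
    mask = lut >= 0
    vox[:, mask] = flat[:, lut[mask]]         # voxels outside the view stay zero
    return vox.reshape(c, *grid_shape)
```

Because the table depends only on calibration and grid geometry, it can be built offline under these assumptions, leaving a single gather per camera at inference time and no depth prediction or attention step.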