Scene understanding is crucial for autonomous robots in dynamic environments to predict future states, avoid collisions, and plan paths. Camera and LiDAR perception have made tremendous progress in recent years but face limitations under adverse weather conditions. To leverage the full potential of multi-modal sensor suites, radar sensors are essential for safety-critical tasks and are already installed in most new vehicles today. In this paper, we address the problem of semantic segmentation of moving objects in radar point clouds to enhance environment perception with another sensor modality. Instead of aggregating multiple scans to densify the point clouds, we propose a novel approach based on the self-attention mechanism to accurately perform sparse, single-scan segmentation. Our approach, called Gaussian Radar Transformer, includes the newly introduced Gaussian transformer layer, which replaces the softmax normalization with a Gaussian function to decouple the contributions of individual points. To tackle the challenge of capturing long-range dependencies with transformers, we propose attentive up- and downsampling modules that enlarge the receptive field and capture strong spatial relations. We compare our approach to other state-of-the-art methods on the RadarScenes data set and show superior segmentation quality in diverse environments, even without exploiting temporal information.
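The core idea of the Gaussian transformer layer, replacing the softmax normalization with a Gaussian function so that each point's attention weight no longer depends on the scores of all other points, can be illustrated with a minimal sketch. Note that this is an assumption-laden illustration, not the paper's actual implementation: the exact form of the Gaussian weighting, the score computation, and the parameter `sigma` are hypothetical here.

```python
import numpy as np

def gaussian_attention(q, k, v, sigma=1.0):
    """Illustrative single-head attention where a Gaussian function
    replaces the softmax over attention scores (sketch, not the
    paper's exact formulation)."""
    d = q.shape[-1]
    # Scaled dot-product scores, as in standard attention.
    scores = q @ k.T / np.sqrt(d)
    # Hypothetical Gaussian weighting: each weight depends only on
    # its own score, so the contribution of each point is decoupled
    # from the others (no normalization across keys, unlike softmax).
    weights = np.exp(-(scores ** 2) / (2.0 * sigma ** 2))
    return weights @ v

def softmax_attention(q, k, v):
    """Standard softmax attention, for comparison: weights for one
    query are coupled because they must sum to one over all keys."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return (e / e.sum(axis=-1, keepdims=True)) @ v
```

In the softmax variant, adding one more key point rescales every other weight for that query; with the Gaussian variant, existing weights are unchanged, which is one way to read "decoupling the contributions of individual points".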