Autonomous vehicles need a semantic understanding of the three-dimensional world around them in order to reason about their environment. State-of-the-art methods use deep neural networks to predict a semantic class for each point in a LiDAR scan. A powerful and efficient way to process LiDAR measurements is to use two-dimensional, image-like projections. In this work, we perform a comprehensive experimental study of image-based semantic segmentation architectures for LiDAR point clouds. We demonstrate various techniques that boost segmentation performance and reduce runtime and memory requirements. First, we examine the effect of network size and suggest that much faster inference can be achieved at a very low cost to accuracy. Second, we introduce an improved point cloud projection technique that does not suffer from systematic occlusions. We also use a cyclic padding mechanism that provides context at the horizontal field-of-view boundaries. Third, we perform experiments with a soft Dice loss function that directly optimizes for the intersection-over-union metric. Finally, we propose a new kind of convolution layer with a reduced amount of weight sharing along one of the two spatial dimensions, addressing the large difference in appearance along the vertical axis of a LiDAR scan. With a final combination of these methods, the model achieves an increase of 3.2% in mean intersection-over-union (mIoU) over the baseline while requiring only 42% of the original inference time.
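Two of the techniques named above, cyclic padding and the soft Dice loss, are simple enough to illustrate in a few lines. The following is a minimal sketch in plain Python, not the authors' implementation. The function names `cyclic_pad_width` and `soft_dice_loss`, the list-based data layout, and the smoothing constant `eps` are assumptions made for illustration, and the Dice formulation shown is the commonly used one, which may differ in detail from the variant used in the paper.

```python
from typing import List, Sequence


def cyclic_pad_width(image: Sequence[Sequence[float]], pad: int) -> List[List[float]]:
    """Pad every row of a range image horizontally by wrapping around.

    A 360-degree LiDAR projection is periodic in azimuth, so the columns
    "left of" column 0 are the last columns of the same row, and the
    columns "right of" the last column are the first ones.
    """
    if pad < 0:
        raise ValueError("pad must be non-negative")
    padded = []
    for row in image:
        width = len(row)
        if pad > width:
            raise ValueError("pad must not exceed the image width")
        left = list(row[width - pad:])   # last `pad` columns wrap to the left
        right = list(row[:pad])          # first `pad` columns wrap to the right
        padded.append(left + list(row) + right)
    return padded


def soft_dice_loss(
    probs: Sequence[Sequence[float]],
    targets: Sequence[Sequence[float]],
    eps: float = 1e-6,
) -> float:
    """Soft Dice loss averaged over classes (assumed common formulation).

    probs[c][i]   : predicted probability of class c at point i
    targets[c][i] : 1.0 if point i belongs to class c, else 0.0
    """
    dice_scores = []
    for p_c, g_c in zip(probs, targets):
        intersection = sum(p * g for p, g in zip(p_c, g_c))
        denominator = sum(p_c) + sum(g_c)
        # eps keeps the ratio defined when a class is absent from the scan
        dice_scores.append((2.0 * intersection + eps) / (denominator + eps))
    return 1.0 - sum(dice_scores) / len(dice_scores)


if __name__ == "__main__":
    print(cyclic_pad_width([[1, 2, 3, 4]], pad=1))
    one_hot = [[1.0, 0.0], [0.0, 1.0]]
    print(soft_dice_loss(one_hot, one_hot))
```

In a tensor library the padding step is usually a single call, for example `numpy.pad` with `mode="wrap"` applied to the horizontal axis only. The per-class Dice score D relates to the per-class IoU by IoU = D / (2 - D), which is why a Dice-based loss serves as a differentiable surrogate for the IoU metric.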