Semantic segmentation is a fundamental task that lets agricultural robots understand their surroundings in natural orchards. Recent developments in LiDAR techniques enable a robot to acquire accurate range measurements of the scene in unstructured orchards. Unlike RGB images, 3D point clouds directly encode scene geometry. By combining LiDAR and a camera, rich information on both geometry and texture can be obtained. In this work, we propose a deep-learning-based method that performs accurate semantic segmentation on fused data from a LiDAR-Camera visual sensor. Two critical problems are explored and solved. The first is how to efficiently fuse the texture and geometrical features from multi-sensor data. The second is how to efficiently train the 3D segmentation network under severely imbalanced class conditions. Moreover, an implementation of 3D segmentation in orchards, including LiDAR-Camera data fusion, data collection and labelling, network training, and model inference, is introduced in detail. In the experiments, we comprehensively analyze the network setup when dealing with highly unstructured and noisy point clouds acquired from an apple orchard. Overall, our proposed method achieves 86.2% mIoU for fruit segmentation on high-resolution point clouds (100k-200k points). The experimental results show that the proposed method can perform accurate segmentation in real orchard environments.
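The abstract does not spell out how the LiDAR and camera data are fused. As an illustration only, the sketch below shows one common way to attach image texture to a point cloud: each LiDAR point is projected into the image through a calibrated pinhole camera, and the pixel color at that location is appended to the point. The function name `colorize_points`, the intrinsic matrix `K`, and the extrinsic transform `T_cam_lidar` are hypothetical and not taken from the paper. Lens distortion and occlusion handling are ignored.

```python
import numpy as np


def colorize_points(points, image, K, T_cam_lidar):
    """Attach RGB values to LiDAR points that project inside the image.

    Assumed inputs (hypothetical, not from the paper):
      points:      (N, 3) xyz coordinates in the LiDAR frame
      image:       (H, W, 3) uint8 RGB image
      K:           (3, 3) pinhole camera intrinsics
      T_cam_lidar: (4, 4) rigid transform from LiDAR frame to camera frame
    Returns an (M, 6) array of [x, y, z, r, g, b], colors scaled to [0, 1].
    """
    h, w = image.shape[:2]

    # Move the points into the camera frame using homogeneous coordinates.
    homo = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_cam_lidar @ homo.T).T[:, :3]

    # Discard points behind (or on) the camera plane.
    in_front = cam[:, 2] > 1e-6
    cam = cam[in_front]
    kept = points[in_front]

    # Perspective projection to integer pixel coordinates.
    uvw = (K @ cam.T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(int)

    # Keep only points that land inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    rgb = image[v[inside], u[inside]].astype(np.float64) / 255.0

    # Geometry (xyz) and texture (rgb) side by side, one row per point.
    return np.hstack([kept[inside], rgb])
```

Under these assumptions, the resulting six-channel points could serve as input to a point-based segmentation network. Points that fall outside the camera's field of view are dropped here; a real pipeline might instead keep them with a placeholder color.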