Collision-free space detection is a critical component of autonomous vehicle perception. State-of-the-art algorithms are typically based on supervised learning, and the performance of such approaches is heavily dependent on the quality and amount of labeled training data. Additionally, it remains an open challenge to train deep convolutional neural networks (DCNNs) using only a small quantity of training samples. Therefore, this paper mainly explores an effective training data augmentation approach that can be employed to improve overall DCNN performance when additional images captured from different views are available. Since the pixels of the collision-free space (generally regarded as a planar surface) in two images captured from different views can be associated by a homography matrix, the target image can be transformed into the reference view. This provides a simple but effective way of generating training data from additional multi-view images. Extensive experimental results, conducted with six state-of-the-art semantic segmentation DCNNs on three datasets, demonstrate the effectiveness of our proposed training data augmentation algorithm for enhancing collision-free space detection performance. When validated on the KITTI road benchmark, our approach provides the best results for stereo vision-based collision-free space detection.
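The core geometric idea above can be illustrated with a short sketch. This is not the paper's implementation; it is a minimal example, assuming an illustrative 3x3 homography matrix `H`, of how pixels on a planar road surface in a target view map to the corresponding pixels in the reference view:

```python
import numpy as np

# Illustrative homography relating a target view to a reference view.
# In practice, H would be estimated from calibration or point
# correspondences on the planar collision-free space; the values
# below are assumptions chosen only for demonstration.
H = np.array([[1.0, 0.02, 5.0],
              [0.0, 1.00, 3.0],
              [0.0, 0.00, 1.0]])

def warp_point(H, x, y):
    """Map pixel (x, y) through homography H in homogeneous coordinates."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# A road-surface pixel in the target view, mapped into the reference view:
u, v = warp_point(H, 100.0, 200.0)
```

Warping every labeled road pixel this way turns one annotated view into an additional training sample in the reference view, which is the essence of the augmentation strategy described above.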