Deep learning approaches have become the standard solution to many problems in computer vision and robotics, but obtaining enough training data of sufficiently high quality is challenging, because human labor is error-prone, time-consuming, and expensive. Simulation-based solutions have become more popular in recent years, but the gap between simulation and reality remains a major issue. In this paper, we introduce a novel method for augmenting synthetic image data through unsupervised image-to-image translation: using open-source frameworks, we apply the style of real-world images to simulated images. The generated dataset is combined with conventional augmentation methods and then used to train a neural network model that runs in real time on autonomous soccer robots. Our evaluation shows a significant improvement compared to models trained on images generated entirely in simulation.