Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification and semantic segmentation models. In this work, we explore how high-quality physically based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle. We use this dataset to quantify the effectiveness of synthetic augmentation using U-net and Double-U-net models. We found that, for this domain, synthetic images were an effective means of augmenting limited sets of real training data. We observed that models trained on purely synthetic images had a very low mean prediction IoU on real validation images. We also observed that adding even a very small number of real images to a synthetic dataset greatly improved accuracy, and that models trained on datasets augmented with synthetic images were more accurate than those trained on real images alone. Finally, we found that in use cases that benefit from incremental training or model specialization, pretraining a base model on synthetic images provided a sizeable reduction in the training cost of transfer learning, allowing up to 90% of the model training to be front-loaded.
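The accuracy figures above are stated as mean prediction IoU (intersection over union). The abstract does not say exactly how this is computed, so the following is only a minimal sketch, assuming binary segmentation masks and a simple per-image average; the function names `iou` and `mean_iou` are hypothetical and not taken from the work itself.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a set of predictions and ground truths."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

Under this definition, a model that predicts twice as many foreground pixels as the ground truth contains, all of them covering the true region, scores an IoU of 0.5 on that image.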