Synthetic data promises cheap, plentiful training data for settings where large amounts of labeled real-world data are unavailable. However, models trained on synthetic data significantly underperform on real-world data. In this paper, we propose Proportional Amplitude Spectrum Training Augmentation (PASTA), a simple and effective augmentation strategy that improves out-of-the-box synthetic-to-real (syn-to-real) generalization. PASTA perturbs the amplitude spectra of synthetic images in the Fourier domain to generate augmented views. The perturbation is structured so that high-frequency components are perturbed relatively more than low-frequency ones. For semantic segmentation (GTAV to Real), object detection (Sim10K to Real), and object recognition (VisDA-C Syn to Real), across a total of 5 syn-to-real shifts, we find that PASTA outperforms more complex state-of-the-art generalization methods while also being complementary to them.
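The amplitude perturbation described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact perturbation schedule, so everything specific here is an assumption made for illustration: the function names `radial_sigma` and `pasta_like_augment`, the parameters `alpha`, `k`, and `beta`, and the choice of multiplicative Gaussian noise whose standard deviation grows as a power of the normalized spatial frequency. It should be read as one plausible instance of "perturb high frequencies more than low ones", not as the authors' implementation.

```python
import numpy as np


def radial_sigma(h, w, alpha=3.0, k=2.0, beta=0.25):
    """Per-frequency noise scale that grows with spatial frequency.

    ASSUMPTION: the schedule sigma = alpha * r**k + beta is hypothetical;
    r is the radial frequency normalized to [0, 1] (0 at the DC term).
    """
    fy = np.fft.fftfreq(h)  # cycles per sample, in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2) / np.sqrt(0.5)
    return alpha * radius ** k + beta


def pasta_like_augment(image, alpha=3.0, k=2.0, beta=0.25, rng=None):
    """Return an augmented view of `image`, shaped (H, W) or (H, W, C)."""
    rng = np.random.default_rng() if rng is None else rng
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape[:2]

    # Split the 2-D spectrum (taken over the spatial axes) into
    # amplitude and phase; only the amplitude is perturbed.
    spectrum = np.fft.fft2(img, axes=(0, 1))
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Broadcast the (H, W) noise scale over any trailing channel axis.
    sigma = radial_sigma(h, w, alpha, k, beta)
    sigma = sigma.reshape((h, w) + (1,) * (img.ndim - 2))

    # Multiplicative noise centered at 1, clipped so amplitudes stay >= 0.
    eps = rng.normal(loc=1.0, scale=sigma, size=img.shape)
    eps = np.clip(eps, 0.0, None)

    # Recombine with the original phase. The noise is not conjugate
    # symmetric, so the inverse transform is complex; keep the real part.
    perturbed = amplitude * eps * np.exp(1j * phase)
    return np.fft.ifft2(perturbed, axes=(0, 1)).real


if __name__ == "__main__":
    demo = np.random.default_rng(0).random((64, 64, 3))
    view = pasta_like_augment(demo, rng=np.random.default_rng(1))
    print(view.shape)
```

In a training pipeline, a function like this would be applied to each synthetic image before the usual normalization, with the output clipped or rescaled to the valid pixel range if needed. Setting `alpha` and `beta` to zero disables the perturbation, which is a convenient sanity check.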