Synthetic-to-real transfer learning is a framework in which models are pre-trained on synthetically generated images with ground-truth annotations and then fine-tuned for real tasks. Although synthetic images overcome the data-scarcity issue, it remains unclear how fine-tuning performance scales with the pre-trained model, especially in terms of pre-training data size. In this study, we collect a number of empirical observations and uncover the underlying regularity. Through experiments, we observe a simple and general scaling law that consistently describes learning curves across various tasks, models, and complexities of the synthesized pre-training data. Furthermore, we develop a theory of transfer learning for a simplified scenario and confirm that the derived generalization bound is consistent with our empirical findings.
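For illustration, scaling laws of this kind are commonly expressed as a power law in the pre-training data size n. The minimal sketch below fits one such hypothesized form, loss(n) = c * n^(-alpha) + b, to measured fine-tuning losses with SciPy; the functional form, the data points, and the parameter names are assumptions for illustration, not the exact law derived in this study.

```python
# A minimal sketch (assumed form, not necessarily this paper's exact law):
# fit a power-law-plus-offset learning curve to fine-tuning test losses
# measured at several synthetic pre-training dataset sizes n.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n, c, alpha, b):
    """Hypothesized learning curve: loss(n) = c * n^(-alpha) + b,
    where n is the number of synthetic pre-training images."""
    return c * n ** (-alpha) + b

# Hypothetical measurements: pre-training sizes and fine-tuned test losses.
n = np.array([1e3, 1e4, 1e5, 1e6, 1e7])
loss = np.array([0.52, 0.37, 0.29, 0.25, 0.235])

(c, alpha, b), _ = curve_fit(scaling_law, n, loss, p0=[1.0, 0.3, 0.2])
print(f"fitted: loss(n) = {c:.3f} * n^(-{alpha:.3f}) + {b:.3f}")
```

Under this assumed form, the offset b would capture an irreducible fine-tuning error that more synthetic pre-training data cannot remove, while the exponent alpha governs how quickly the curve decays toward it.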