Recent text-to-image generation models have shown promising results in generating high-fidelity photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are for recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data to recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.
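To make the basic pipeline concrete, below is a minimal sketch of generating class-conditional synthetic images with an off-the-shelf text-to-image model, which could then serve as training data for a classifier. It assumes the `diffusers` library and a publicly available Stable Diffusion checkpoint; the model id, prompt template, and class list are illustrative placeholders, not the exact settings used in this work.

```python
# Minimal sketch: synthesize labeled images from class-name prompts with a
# text-to-image model. Checkpoint and prompt template are assumptions for
# illustration, not the paper's exact configuration.
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # assumed checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

class_names = ["golden retriever", "tabby cat"]   # hypothetical label set
images_per_class = 4
os.makedirs("synthetic", exist_ok=True)

for name in class_names:
    prompt = f"a photo of a {name}"               # simple prompt template (assumption)
    for i in range(images_per_class):
        image = pipe(prompt).images[0]            # returns a PIL.Image
        image.save(f"synthetic/{name.replace(' ', '_')}_{i}.png")
```

The saved images, labeled by the class name used in the prompt, can be mixed with or substituted for real data when training or pre-training a recognition model.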