Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on fewer than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development which stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation.
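To make the abstract's "GAN inversion unified with dataset generation" concrete, here is a minimal sketch of the kind of pipeline it describes: invert the small set of labeled real images into a pre-trained generator's latent space, train a lightweight per-pixel label head on the generator's intermediate features, then sample fresh latents so every new synthetic image comes with a label at no annotation cost. This is an illustrative reading, not the paper's actual implementation; `G.features`, `G.sample_latent`, `G.synthesize`, `invert`, and `LabelHead` are all hypothetical placeholder names.

```python
# Hedged sketch of a HandsOff-style labeled-dataset generator.
# Assumes: a pre-trained StyleGAN-like generator `G` exposing intermediate
# per-pixel features, an inversion routine `invert` (encoder- or
# optimization-based), and ~50 labeled real images. Names are illustrative.
import torch
import torch.nn as nn

class LabelHead(nn.Module):
    """Per-pixel classifier applied to upsampled GAN features (last dim)."""
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, feats):          # feats: (H, W, feat_dim)
        return self.mlp(feats)         # per-pixel class logits: (H, W, C)

def train_label_head(G, invert, images, masks, feat_dim, num_classes, steps=1000):
    # 1) Invert each labeled real image into the GAN's latent space so its
    #    existing annotation is paired with the generator's features.
    latents = [invert(G, x) for x in images]
    head = LabelHead(feat_dim, num_classes)
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    for _ in range(steps):
        for w, y in zip(latents, masks):           # y: (H, W) int labels
            feats = G.features(w)                  # hypothetical feature hook
            logits = head(feats).permute(2, 0, 1).unsqueeze(0)  # (1, C, H, W)
            loss = ce(logits, y.unsqueeze(0))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head

def generate_labeled_pair(G, head):
    # 2) Sample a fresh latent: the generator yields the image, and the
    #    trained head yields its pixel-wise label "for free".
    w = G.sample_latent()
    return G.synthesize(w), head(G.features(w)).argmax(-1)
```

Under these assumptions, step 2 can be repeated indefinitely, which is what lets the framework emit an unlimited number of image-label pairs after seeing only the small seed set.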