Current state-of-the-art segmentation techniques for ocular images critically depend on large-scale annotated datasets, which are labor-intensive to gather and often raise privacy concerns. To address these issues, we present a novel framework, called BiOcularGAN, capable of generating large-scale synthetic datasets of photorealistic (visible light and near infrared) ocular images, together with corresponding segmentation labels. At its core, the framework relies on a novel Dual-Branch StyleGAN2 (DB-StyleGAN2) model that facilitates bimodal image generation, and on a Semantic Mask Generator (SMG) that produces semantic annotations by exploiting DB-StyleGAN2's feature space. We evaluate BiOcularGAN through extensive experiments across five diverse ocular datasets and analyze the effects of bimodal data generation on image quality and on the produced annotations. Our experimental results show that BiOcularGAN is able to produce high-quality matching bimodal images and annotations (with minimal manual intervention) that can be used to train highly competitive (deep) segmentation models that perform well across multiple real-world datasets. The source code will be made publicly available.