Despite remarkable recent progress in image-to-image translation, complex scenes containing multiple dissimilar objects remain a challenging problem: translated images suffer from low fidelity, tiny objects rendered with few details, and unsatisfactory object recognition performance. Without thorough object-level perception of the image (i.e., bounding boxes, categories, and masks) as prior knowledge, the style transformation of each object is difficult to track during the translation process. We propose panoptic-based object style-align generative adversarial networks (POSA-GANs) for image-to-image translation, together with a compact panoptic segmentation dataset. A panoptic segmentation model is utilized to extract panoptic-level perception (i.e., overlap-removed foreground object instances and background semantic regions in the image), which guides the alignment between the object content codes of the input-domain image and object style codes sampled from the style space of the target domain. The style-aligned object representations are further transformed to obtain a precise boundary layout for higher-fidelity object generation. The proposed method was systematically compared with competing methods and achieved significant improvements in both image quality and object recognition performance for translated images.
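The per-object alignment between content codes and sampled target-domain style codes can be illustrated with a minimal NumPy sketch. This assumes an AdaIN-style normalization (a common choice for content/style alignment, not necessarily the paper's exact operator), where each panoptic mask selects the feature pixels of one object and re-normalizes them to the sampled style statistics; the function name and shapes are illustrative.

```python
import numpy as np

def align_object_style(features, mask, style_mean, style_std, eps=1e-5):
    """Re-normalize the features inside one panoptic segment to
    target-domain style statistics (AdaIN-style; illustrative only).

    features:   (C, H, W) content feature map of the input-domain image
    mask:       (H, W) boolean mask of one overlap-removed panoptic segment
    style_mean: (C, 1) per-channel mean sampled from the target style space
    style_std:  (C, 1) per-channel std sampled from the target style space
    """
    out = features.copy()
    region = features[:, mask]                    # (C, N) pixels of this object
    c_mean = region.mean(axis=1, keepdims=True)   # object's own content statistics
    c_std = region.std(axis=1, keepdims=True)
    # Whiten with the content statistics, then re-color with the style statistics.
    out[:, mask] = (region - c_mean) / (c_std + eps) * style_std + style_mean
    return out
```

Looping this over all panoptic segments of an image, each with its own sampled style code, gives per-object rather than global style transfer; background pixels outside every mask are left untouched.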