Despite remarkable progress in image translation, complex scenes containing multiple discrepant objects remain a challenging problem. Translated images often suffer from low fidelity, and tiny objects are rendered with little detail, leading to unsatisfactory object recognition performance. Without thorough object perception (i.e., bounding boxes, categories, and masks) of the image as prior knowledge, the style transformation of each object is difficult to track during translation. We propose panoptic-aware generative adversarial networks (PanopticGAN) for image-to-image translation, together with a compact panoptic segmentation dataset. Panoptic perception (i.e., foreground instances and background semantics of the image scene) is extracted to align object content codes of the input domain with panoptic-level style codes sampled from the target style space; the aligned features are then refined by a proposed feature masking module that sharpens object boundaries. An image-level combination of content codes and sampled style codes is also merged to generate higher-fidelity images. Our proposed method was systematically compared with competing methods and achieved significant improvements in both image quality and object recognition performance.
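To make the feature masking idea concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it gates translated feature maps with resized panoptic masks to suppress activations that leak across object boundaries. The module name, the learned 1x1 gate, and the mask-merging strategy are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureMasking(nn.Module):
    """Hypothetical sketch of a feature masking module: gate feature maps
    with per-object panoptic masks to sharpen object boundaries."""

    def __init__(self, channels: int):
        super().__init__()
        # Learned 1x1 gate applied to the masked features (an assumption,
        # not confirmed by the abstract).
        self.gate = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor, masks: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) translated feature maps
        # masks: (B, N, h, w) binary panoptic masks (instances + stuff)
        # Resize masks to the feature resolution and merge them into a
        # single foreground map.
        masks = F.interpolate(masks.float(), size=feats.shape[-2:], mode="nearest")
        merged = masks.amax(dim=1, keepdim=True)  # (B, 1, H, W)
        # Zero out features outside any panoptic region, then add a gated
        # residual so boundaries are emphasized rather than hard-cut.
        masked = feats * merged
        return feats + torch.sigmoid(self.gate(masked)) * masked
```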