Synthetic images created by generative models increase in quality and expressiveness as newer models utilize larger datasets and novel architectures. Although this photorealism is a positive side-effect from a creative standpoint, it becomes problematic when such generative models are used for impersonation without consent. Most existing synthesis approaches are built on partial transfer between source and target pairs, or they generate completely new samples from an idealized distribution that still resemble the closest real sample in the dataset. We propose MixSyn (read as "mixin'") for learning novel fuzzy compositions from multiple sources and creating novel images as a mix of image regions corresponding to those compositions. MixSyn not only combines uncorrelated regions from multiple source masks into a coherent semantic composition, but also generates mask-aware, high-quality reconstructions of non-existing images. We compare MixSyn to state-of-the-art single-source sequential generation and collage generation approaches in terms of quality, diversity, realism, and expressive power; we also showcase interactive synthesis, mix & match, and edit propagation tasks, with no mask dependency.
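To make the mask-composition idea concrete, here is a minimal sketch of combining regions from multiple source segmentation masks into one composite semantic layout. This is a simplified, hard-label illustration only: the label ids, the `compose_masks` helper, and the toy masks are assumptions for this example, and MixSyn itself learns *fuzzy* (soft) compositions rather than the hard overwrite shown here.

```python
import numpy as np

# Hypothetical label ids for a face-parsing mask (assumption; the
# paper's actual label set is not specified in this abstract).
HAIR, EYES, SKIN = 1, 2, 3

def compose_masks(base_mask, donor_mask, donor_labels):
    """Build a composite semantic mask: start from `base_mask` and
    overwrite the regions listed in `donor_labels` with the
    corresponding regions of `donor_mask`."""
    composite = base_mask.copy()
    for label in donor_labels:
        composite[donor_mask == label] = label
    return composite

# Toy 4x4 masks standing in for two sources' segmentation maps.
a = np.full((4, 4), SKIN)
a[0, :] = HAIR            # source A contributes the hair region
b = np.full((4, 4), SKIN)
b[1, 1:3] = EYES          # source B contributes the eye region

# Composite layout: A's hair and skin, B's eyes.
mixed = compose_masks(a, b, donor_labels=[EYES])
```

A generator conditioned on `mixed` would then synthesize a single coherent image whose regions follow this multi-source layout.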