Visual cognition in primates is superior to that of artificial neural networks in its ability to 'envision' a visual object, even a newly-introduced one, with different attributes including pose, position, color, texture, etc. To help neural networks envision objects with different attributes, we propose a family of objective functions, expressed on groups of examples, as a novel learning framework that we term Group-Supervised Learning (GSL). GSL allows us to decompose inputs into a disentangled representation with swappable components that can be recombined to synthesize new samples. For instance, images of red boats and blue cars can be decomposed and recombined to synthesize novel images of red cars. We propose an auto-encoder-based implementation, termed group-supervised zero-shot synthesis network (GZS-Net), trained with our learning framework, that can produce a high-quality red car even if no such example is witnessed during training. We test our model and learning framework on existing benchmarks, in addition to a new dataset that we open-source. We qualitatively and quantitatively demonstrate that GZS-Net trained with GSL outperforms state-of-the-art methods.
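The core idea of swappable components can be sketched as follows. This is a minimal illustration, not the paper's implementation: the latent layout, segment names, and sizes are hypothetical, and real encoders/decoders are replaced by toy latent vectors.

```python
import numpy as np

# Hypothetical latent layout: the encoder output is partitioned into
# fixed-size segments, one per attribute (identity, pose, color).
SEGMENTS = {"identity": slice(0, 4), "pose": slice(4, 8), "color": slice(8, 12)}

def swap_attribute(z_a, z_b, attribute):
    """Return copies of the two latents with the named segment exchanged."""
    z_a, z_b = z_a.copy(), z_b.copy()
    seg = SEGMENTS[attribute]
    z_a[seg], z_b[seg] = z_b[seg].copy(), z_a[seg].copy()
    return z_a, z_b

# Toy latents standing in for an encoded "red boat" and "blue car".
z_red_boat = np.concatenate([np.full(4, 1.0), np.zeros(4), np.full(4, 0.9)])
z_blue_car = np.concatenate([np.full(4, 2.0), np.zeros(4), np.full(4, 0.1)])

# Exchanging the color segment yields latents for a "blue boat" and a
# "red car"; decoding the latter would synthesize the never-seen red car.
z_blue_boat, z_red_car = swap_attribute(z_red_boat, z_blue_car, "color")
```

Because each attribute lives in its own segment, recombination is a simple slice exchange; the learning framework's job is to make the encoder place each attribute in its designated segment.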