The duality of content and style is inherent to the nature of art. For humans, these two elements are clearly distinct: content refers to the objects and concepts depicted in a work of art, and style to the way they are expressed. This duality poses an important challenge for computer vision. The visual appearance of objects and concepts is modulated by style, which may reflect the author's emotions, social trends, artistic movement, and so on, and a deep comprehension of artworks undoubtedly requires handling both. A promising step towards a general paradigm for art analysis is to disentangle content and style, whereas relying on human annotations that capture only a single aspect of artworks has limitations for learning both semantic concepts and the visual appearance of paintings. We thus present GOYA, a method that distills the artistic knowledge captured in a recent generative model to disentangle content and style. Experiments show that synthetically generated images serve as a sufficient proxy for the real distribution of artworks, allowing GOYA to represent the two elements of art separately while retaining more information than existing methods.
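To make the high-level idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a text-to-image diffusion backbone accessed through the diffusers library, frozen CLIP image features as the shared representation, and two hypothetical linear heads (content_head and style_head) trained on synthetic paintings whose content and style prompts are varied independently, so that each head is pushed to encode only its own factor.

```python
# Minimal sketch of distilling a generative model into content/style representations.
# Assumptions (not from the paper): Stable Diffusion v1.5 for synthesis, CLIP ViT-B/32
# features, and simple linear classification heads; content_head/style_head are hypothetical.
import itertools
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Generate synthetic artworks as a proxy for the real distribution of paintings,
#    varying content and style independently across prompts.
contents = ["a portrait of a woman", "a bowl of fruit", "a mountain landscape"]
styles = ["in the style of Impressionism", "in the style of Cubism"]
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
images, content_labels, style_labels = [], [], []
for (ci, content), (si, style) in itertools.product(enumerate(contents), enumerate(styles)):
    images.append(pipe(f"{content}, {style}").images[0])
    content_labels.append(ci)
    style_labels.append(si)

# 2) Extract frozen features from a pretrained vision-language encoder.
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
with torch.no_grad():
    inputs = processor(images=images, return_tensors="pt").to(device)
    feats = clip.get_image_features(**inputs)  # shape (N, 512)

# 3) Train two separate heads: one predicts the content prompt, the other the style prompt.
#    Because content and style vary independently in the synthetic data, each head can
#    only reduce its loss by attending to its own factor -- the disentanglement intuition.
content_head = torch.nn.Linear(feats.shape[1], len(contents)).to(device)
style_head = torch.nn.Linear(feats.shape[1], len(styles)).to(device)
opt = torch.optim.Adam(
    list(content_head.parameters()) + list(style_head.parameters()), lr=1e-3
)
c_t = torch.tensor(content_labels, device=device)
s_t = torch.tensor(style_labels, device=device)
for _ in range(200):
    loss = (torch.nn.functional.cross_entropy(content_head(feats), c_t)
            + torch.nn.functional.cross_entropy(style_head(feats), s_t))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, the outputs of the two heads (or the features feeding them) can be used as separate content and style embeddings for downstream analysis; the actual method may use different objectives and architectures than this illustrative setup.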