A recent spate of state-of-the-art semi-supervised and unsupervised solutions disentangle and encode image "content" into a spatial tensor and image appearance or "style" into a vector, to achieve good performance in spatially equivariant tasks (e.g. image-to-image translation). To achieve this, they employ different biases in model design, learning objective, and data. While considerable effort has been made to measure disentanglement in vector representations and to assess its impact on task performance, such analysis for (spatial) content-style disentanglement is lacking. In this paper, we conduct an empirical study to investigate the role of different biases in content-style disentanglement settings and to unveil the relationship between the degree of disentanglement and task performance. In particular, we: (i) identify key design choices and learning constraints for three popular content-style disentanglement models; (ii) relax or remove these constraints in an ablation fashion; and (iii) use two metrics to measure the degree of disentanglement and assess its effect on the performance of each task. Our experiments reveal that there is a "sweet spot" between disentanglement, task performance and, surprisingly, content interpretability, suggesting that blindly forcing higher disentanglement can hurt both model performance and the semanticness of the content factors. Our findings, as well as the task-independent metrics we use, can guide the design and selection of new models for tasks where content-style representations are useful.