We study the problem of unsupervised discovery and segmentation of object parts, which, as an intermediate local representation, can reveal intrinsic object structure and provide more explainable recognition results. Recent unsupervised methods have greatly relaxed the dependency on annotated data, which are costly to obtain, but still rely on additional information such as object segmentation masks or saliency maps. To remove this dependency and further improve part segmentation performance, we develop a novel approach that disentangles the appearance and shape representations of object parts, followed by reconstruction losses, without using additional object mask information. To avoid degenerate solutions, a bottleneck block is designed to squeeze and expand the appearance representation, leading to a more effective disentanglement between geometry and appearance. Combined with a self-supervised part classification loss and an improved geometry concentration constraint, the approach segments more consistent parts with semantic meanings. Comprehensive experiments on a wide variety of objects, such as faces, birds, and PASCAL VOC objects, demonstrate the effectiveness of the proposed method.