We design the first multi-layer disentanglement metric that operates at all hierarchy levels of a structured latent representation, and we derive its theoretical properties. Applied to object-centric representations, the metric unifies the evaluation of object separation between latent slots and of disentanglement within each slot in a common mathematical framework. It also addresses a weakness of earlier pixel-level segmentation metrics such as the Adjusted Rand Index (ARI), namely their problematic dependence on the sharpness of segmentation masks. Perhaps surprisingly, our experimental results show that good ARI values do not guarantee a disentangled representation, and that the exclusive focus on this metric has led to counterproductive choices in some previous evaluations. As an additional technical contribution, we present a new algorithm for obtaining feature importances that handles slot permutation invariance in the representation.
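For readers unfamiliar with ARI, the following is a minimal sketch of the standard Adjusted Rand Index computed on hard pixel-to-slot assignments. It is not the metric proposed in this work; the function name `adjusted_rand_index` and the toy label lists are illustrative assumptions. The sketch shows that ARI is unchanged when slot labels are permuted and that it scores only which pixels are grouped together, so it carries no information about how factors are encoded inside a slot.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Standard ARI between two hard labelings of the same pixels.

    Illustrative helper (hypothetical name), not the paper's metric.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("labelings must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pixel pairs to compare; treat as perfect agreement

    # Contingency counts: pixels sharing the same (true, predicted) label pair.
    pair_counts = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # chance-level agreement
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings have a single cluster
    return (sum_pairs - expected) / (max_index - expected)


# Toy example: 8 pixels belonging to 3 ground-truth objects.
true_objects = [0, 0, 0, 1, 1, 1, 2, 2]

# Same grouping, but the slots are numbered differently.
permuted_slots = [2, 2, 2, 0, 0, 0, 1, 1]

# Two objects merged into one slot.
merged_slots = [0, 0, 0, 0, 0, 0, 1, 1]

ari_permuted = adjusted_rand_index(true_objects, permuted_slots)
ari_merged = adjusted_rand_index(true_objects, merged_slots)
print(ari_permuted, ari_merged)
```

In this toy case the permuted labeling scores a perfect ARI while the merged labeling scores lower. Because the score depends only on the pixel grouping, two models with identical masks but very different slot contents would be indistinguishable under it.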