The process of generating data such as images is controlled by independent and unknown factors of variation. Recovering these factors has been studied extensively in disentanglement, causal representation learning, and independent component analysis. Recently, approaches that merge these fields have shown great success. Instead of directly representing the factors of variation, the problem of disentanglement can be seen as finding the interventions on one image that change a single factor. Following this assumption, we introduce a new method for disentanglement inspired by causal dynamics that combines causality theory with vector-quantized variational autoencoders. Our model treats the quantized vectors as causal variables and links them in a causal graph. It performs causal interventions on the graph and generates atomic transitions, each affecting a single factor of variation in the image. We also introduce a new task, action retrieval, which consists of finding the action responsible for the transition between two images. We test our method on standard synthetic and real-world disentanglement datasets. We show that it can effectively disentangle the factors of variation and perform precise interventions on high-level semantic attributes of an image without affecting its quality, even with imbalanced data distributions.