In representation learning, there has been recent interest in developing algorithms to disentangle the ground-truth generative factors behind a dataset, and metrics to quantify how fully this occurs. However, these algorithms and metrics often assume that both representations and ground-truth factors are flat, continuous, and factorized, whereas many real-world generative processes involve rich hierarchical structure, mixtures of discrete and continuous variables with dependence between them, and even varying intrinsic dimensionality. In this work, we develop benchmarks, algorithms, and metrics for learning such hierarchical representations.