This paper studies the causal representation learning problem in which the latent causal variables are observed indirectly through an unknown linear transformation. The objectives are (i) to recover the unknown linear transformation (up to scaling and ordering) and (ii) to determine the directed acyclic graph (DAG) underlying the latent variables. Since identifiable representation learning is impossible from observational data alone, this paper uses both observational and interventional data. The interventional data are generated under distinct single-node randomized hard and soft interventions, which are assumed to cover all nodes in the latent space. It is established that the latent DAG structure can be recovered under soft randomized interventions via the following two steps. First, a set of candidate transformations is formed by including every inverting transformation for which the \emph{score} function of the transformed variables has the minimal number of coordinates that change between an interventional environment and the observational environment, summed over all such pairs. Subsequently, this set is distilled using a simple constraint to recover the latent DAG structure. For the special case of hard randomized interventions, an additional hypothesis testing step also makes it possible to uniquely recover the linear transformation, up to scaling and a valid causal ordering. These results generalize recent results that assume either deterministic hard interventions or linear causal relationships in the latent space.
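The score-change criterion in the first step can be illustrated numerically. The sketch below is not the paper's algorithm. It assumes, purely for illustration, a three-node linear Gaussian latent model (the abstract does not restrict the latent relationships to be linear), for which the score is available in closed form as $s(z) = -\Theta z$ with $\Theta$ the precision matrix. All names (`precision`, `soft_intervene`, `changed_coords`, `total_changes`) and the example matrices are hypothetical. The sketch counts the score coordinates that change between the observational environment and each single-node soft intervention, and compares the correct latent coordinates with a dense mixing of them.

```python
import numpy as np

def precision(B, noise_var):
    """Precision matrix of the linear SEM z = B z + n, n ~ N(0, diag(noise_var))."""
    A = np.eye(B.shape[0]) - B
    return A.T @ np.diag(1.0 / np.asarray(noise_var, dtype=float)) @ A

def soft_intervene(B, noise_var, node, weight_scale=0.5, new_var=2.0):
    """Illustrative soft intervention: rescale incoming weights of `node`, change its noise variance."""
    B_int = B.copy()
    var_int = np.array(noise_var, dtype=float)
    B_int[node, :] *= weight_scale
    var_int[node] = new_var
    return B_int, var_int

def changed_coords(theta_obs, theta_int, M, tol=1e-9):
    """Score coordinates of zhat = M z that differ between the two environments.

    For zero-mean Gaussians the score of z is -Theta z, and the score of
    zhat = M z is M^{-T} times the score of z. Coordinate k therefore changes
    iff row k of M^{-T} (Theta_obs - Theta_int) is nonzero.
    """
    delta = np.linalg.inv(M).T @ (theta_obs - theta_int)
    return {int(k) for k in np.flatnonzero(np.abs(delta).max(axis=1) > tol)}

def total_changes(B, noise_var, M):
    """Number of changed score coordinates, summed over all single-node interventions."""
    theta_obs = precision(B, noise_var)
    total = 0
    for node in range(B.shape[0]):
        theta_int = precision(*soft_intervene(B, noise_var, node))
        total += len(changed_coords(theta_obs, theta_int, M))
    return total

# Assumed example DAG: 0 -> 1, 0 -> 2, 1 -> 2 (B[i, j] is the weight of edge j -> i).
B = np.array([[0.0, 0.0, 0.0],
              [0.8, 0.0, 0.0],
              [-0.5, 1.2, 0.0]])
noise_var = np.ones(3)

M_identity = np.eye(3)                      # correct latent coordinates
M_perm_scale = np.array([[0.0, 2.0, 0.0],   # permutation combined with scaling
                         [0.0, 0.0, -1.0],
                         [3.0, 0.0, 0.0]])
M_mix = np.array([[1.0, 1.0, 0.0],          # dense, invertible mixing
                  [0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])

print(total_changes(B, noise_var, M_identity))    # 1 + 2 + 3 = 6
print(total_changes(B, noise_var, M_perm_scale))  # also 6
print(total_changes(B, noise_var, M_mix))         # strictly more than 6
```

In this toy model, intervening on a node changes the score only in the coordinates of that node and its parents, so the count is unaffected by permutation and scaling but grows when the latent coordinates are mixed. The sketch says nothing about which other transformations might attain the minimum. Narrowing down that candidate set is the role of the second step described above.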