Title: A step towards the applicability of algorithms based on invariant causal learning on observational data

Borja Guerrero Santillan

Machine learning can benefit from causal discovery for interpretation and from causal inference for generalization. In this line of research, a few invariant learning algorithms for out-of-distribution (OOD) generalization have been proposed that use multiple training environments to find invariant relationships. Some of them focus on causal discovery, such as Invariant Causal Prediction (ICP), which finds the causal parents of a variable of interest. Others directly provide a causally optimal predictor that generalizes well in OOD environments, such as Invariant Risk Minimization (IRM). This group of algorithms assumes access to multiple environments that represent different interventions in the causal inference context. Such environments are not normally available when working with observational data and real-world applications. Here we propose a method to generate them efficiently. We assess the performance of this unsupervised learning problem by implementing ICP on simulated data. We also show how to apply ICP efficiently when it is integrated with our method for causal discovery. Finally, we propose an improved version of our method in combination with ICP for datasets with multiple covariates, where ICP and other causal discovery methods normally degrade in performance.
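
To make the role of the environments concrete, the following is a minimal sketch of an ICP-style search on simulated data. It is not the author's implementation. It assumes the environment labels are already known, whereas the abstract's contribution is generating them from observational data, and the abstract does not describe that step, so it is not reproduced here. The simulated graph X1 -> Y -> X2, the helper names (`welch_p_value`, `invariance_p_value`, `icp_sketch`), and the large-sample residual test are all illustrative assumptions. The residual test is a simplified stand-in for the tests used in the ICP literature.

```python
import itertools
import math

import numpy as np


def welch_p_value(a, b):
    """Two-sided p-value for equal means (large-sample normal approximation)."""
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = (a.mean() - b.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def invariance_p_value(X, y, env, subset):
    """p-value for H0: residuals of y on X[:, subset] match across two environments."""
    design = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)  # pooled least-squares fit
    resid = y - design @ coef
    r0, r1 = resid[env == 0], resid[env == 1]
    p_mean = welch_p_value(r0, r1)           # shift in residual mean
    p_var = welch_p_value(r0 ** 2, r1 ** 2)  # shift in residual second moment
    return min(1.0, 2.0 * min(p_mean, p_var))  # Bonferroni over the two checks


def icp_sketch(X, y, env, alpha=0.01):
    """Intersect all covariate subsets whose residuals look invariant."""
    d = X.shape[1]
    p_values = {}
    for k in range(d + 1):
        for subset in itertools.combinations(range(d), k):
            p_values[subset] = invariance_p_value(X, y, env, subset)
    accepted = [set(s) for s, p in p_values.items() if p > alpha]
    parents = set.intersection(*accepted) if accepted else set()
    return parents, p_values


# Simulated data (assumed graph): X1 -> Y -> X2, with two known environments.
rng = np.random.default_rng(0)
n = 2000
env = np.repeat([0, 1], n)
# Environment 1 intervenes on X1 (mean and scale) and on the noise of X2.
x1 = np.where(env == 0, rng.normal(0.0, 1.0, 2 * n), rng.normal(2.0, 2.0, 2 * n))
y = 1.5 * x1 + rng.normal(0.0, 1.0, 2 * n)  # mechanism of Y is the same everywhere
x2 = y + np.where(env == 0, rng.normal(0.0, 0.3, 2 * n), rng.normal(0.0, 1.5, 2 * n))
X = np.column_stack([x1, x2])

parents, p_values = icp_sketch(X, y, env)
for subset, p in p_values.items():
    print(subset, p)
print("estimated parents (column indices):", parents)
```

In this setup only the subset containing X1 (column 0) should leave residuals that look the same in both environments, because the mechanism generating Y from X1 is never intervened on. The search enumerates every subset of covariates, so its cost grows exponentially with their number, which is consistent with the abstract's remark that ICP degrades on datasets with many covariates.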
