Inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables and are difficult to extend to nonlinear relationships or to cyclic data. Inspired by debiased machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parents of the response. We propose an algorithm that works with purely observational data and offers theoretical guarantees, including for partially nonlinear relationships and possibly in the presence of cycles. Because it requires only one estimation per variable, our approach is applicable even to large graphs. We demonstrate significant improvements over established approaches.
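To make the one-vs.-the-rest idea concrete, the following is a minimal sketch of a generic debiased (cross-fitted, residual-on-residual) score computed once per explanatory variable. It is not the algorithm proposed here: the function names `one_vs_rest_scores` and `_ols_predict`, the two-fold split, and the use of ordinary least squares for the nuisance regressions are all assumptions made for illustration. A partially nonlinear setting would call for a more flexible regressor in place of the linear fit.

```python
import numpy as np


def _ols_predict(X_train, y_train, X_test):
    """Fit least squares with an intercept on the training fold, predict on the test fold."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef


def one_vs_rest_scores(X, y, n_folds=2, seed=0):
    """Hypothetical helper: one debiased estimate and t-statistic per column of X.

    For each variable j, both y and X[:, j] are regressed on the remaining
    columns using cross-fitting, and the out-of-fold residuals are regressed
    on each other. Linear nuisance models are an assumption of this sketch.
    """
    n, d = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    theta = np.zeros(d)
    tstat = np.zeros(d)
    for j in range(d):
        rest = np.delete(X, j, axis=1)  # "the rest": every column except j
        res_y = np.zeros(n)
        res_x = np.zeros(n)
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            # Out-of-fold residuals of the response and of variable j.
            res_y[test_idx] = y[test_idx] - _ols_predict(
                rest[train], y[train], rest[test_idx])
            res_x[test_idx] = X[test_idx, j] - _ols_predict(
                rest[train], X[train, j], rest[test_idx])
        denom = res_x @ res_x
        theta[j] = (res_x @ res_y) / denom
        # Heteroskedasticity-robust standard error of the residual regression.
        eps = res_y - theta[j] * res_x
        se = np.sqrt(np.sum(res_x ** 2 * eps ** 2)) / denom
        tstat[j] = theta[j] / se
    return theta, tstat
```

Each variable requires a single pass through the inner loop, so the cost grows linearly with the number of explanatory variables, which is the scaling property the abstract points to. Variables with a large absolute t-statistic would be the candidates for direct causal parents under the assumptions of this sketch.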