Causality and Generalizability: Identifiability and Learning Methods

This PhD thesis contains several contributions to the field of statistical causal modeling. Statistical causal models are statistical models embedded with causal assumptions that allow inference and reasoning about the behavior of stochastic systems affected by external manipulation (interventions). This thesis contributes to the research areas of causal effect estimation, causal structure learning, and distributionally robust (out-of-distribution generalizing) prediction methods. We present novel and consistent linear and nonlinear causal effect estimators in instrumental variable settings that employ data-dependent mean squared prediction error regularization. In certain settings, our proposed estimators show mean squared error improvements over both canonical and state-of-the-art estimators. We show that recent research on distributionally robust prediction methods has connections to well-studied estimators from econometrics. This connection leads us to prove that general K-class estimators possess distributional robustness properties. Furthermore, we propose a general framework for distributional robustness with respect to intervention-induced distributions. In this framework, we derive sufficient conditions for the identifiability of distributionally robust prediction methods and present impossibility results that show the necessity of several of these conditions. We present a new structure learning method applicable in additive noise models with directed trees as causal graphs. We prove consistency in a vanishing identifiability setup and provide a method for testing substructure hypotheses with asymptotic family-wise error control that remains valid post-selection. Finally, we present heuristic ideas for learning summary graphs of nonlinear time-series models.
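
For readers unfamiliar with the K-class family mentioned above, the following is a minimal sketch of the textbook form of the estimator, not the regularized estimators proposed in the thesis. It assumes the standard definition in which kappa = 0 gives ordinary least squares and kappa = 1 gives two-stage least squares. The function name `k_class`, the simulated data, and all parameter values are illustrative assumptions.

```python
import numpy as np

def k_class(X, Y, Z, kappa):
    """Textbook K-class estimate: solves X'(I - kappa*M_Z)X b = X'(I - kappa*M_Z)Y,
    where M_Z = I - P_Z and P_Z projects onto the column space of the instruments Z."""
    # Compute P_Z X and P_Z Y via least squares, avoiding any n x n matrix.
    PX = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    PY = Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]
    # (I - kappa*M_Z) A = (1 - kappa) * A + kappa * P_Z A
    WX = (1 - kappa) * X + kappa * PX
    WY = (1 - kappa) * Y + kappa * PY
    return np.linalg.solve(X.T @ WX, X.T @ WY)

# Illustrative simulation (assumed setup): hidden confounder H, two instruments Z,
# and a true causal effect of X on Y equal to 2.0.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 2))
H = rng.normal(size=n)
X = (Z @ np.array([1.0, -0.5]) + H + rng.normal(size=n))[:, None]
Y = 2.0 * X[:, 0] + 2.0 * H + rng.normal(size=n)

beta_ols = k_class(X, Y, Z, kappa=0.0)   # confounded, biased away from 2.0
beta_tsls = k_class(X, Y, Z, kappa=1.0)  # instrumental variable estimate
print("OLS:", beta_ols, "TSLS:", beta_tsls)
```

Values of kappa between 0 and 1 interpolate between the two estimators, which is the sense in which the family trades predictive fit against protection from confounding in this sketch.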
