We study the problem of learning conditional average treatment effects (CATE) from high-dimensional observational data with unobserved confounders. Unobserved confounders bias CATE estimates and thereby introduce ignorance, a level of unidentifiability, about an individual's response to treatment. We present a new parametric interval estimator, suited to high-dimensional data, that estimates a range of possible CATE values given a predefined bound on the level of hidden confounding. Previous interval estimators do not account for ignorance about the CATE of samples that are underrepresented in the original study or that violate the overlap assumption. Our interval estimator additionally incorporates model uncertainty, so that practitioners can be made aware of out-of-distribution data. We prove that our estimator converges to tight bounds on the CATE when unobserved confounding may be present, and we assess it on semi-synthetic, high-dimensional datasets.
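
To make the quantities in the abstract concrete, the following is a minimal notational sketch. The CATE definition is the standard one. The abstract does not say how the confounding bound is formalized, so the sensitivity-model form below is an assumption chosen for illustration, and the symbols $e(x)$, $e(x, y)$, $\Gamma$, $\underline{\tau}$ and $\overline{\tau}$ are introduced here rather than taken from the source.

```latex
% Standard CATE definition with potential outcomes Y(0), Y(1) and covariates X.
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]

% Assumed form of the bound on hidden confounding (a marginal sensitivity model):
% e(x)    = P(T = 1 | X = x)            is the nominal propensity,
% e(x, y) = P(T = 1 | X = x, Y(t) = y)  is the propensity given the potential outcome,
% and \Gamma \ge 1 is the user-specified bound (\Gamma = 1 means no hidden confounding).
\frac{1}{\Gamma} \;\le\; \frac{e(x)\,\bigl(1 - e(x, y)\bigr)}{\bigl(1 - e(x)\bigr)\, e(x, y)} \;\le\; \Gamma

% An interval estimator then reports a range rather than a point estimate.
\tau(x) \in \bigl[\, \underline{\tau}(x; \Gamma),\; \overline{\tau}(x; \Gamma) \,\bigr]
```

Under this reading, a larger $\Gamma$ permits more hidden confounding and so can only widen the reported interval, while $\Gamma = 1$ collapses it to the usual point estimate.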