We study interval estimation of the effect of a treatment T on an outcome Y in the presence of an unobserved confounder U. Using Hölder's inequality, we derive a set of bounds on the confounding bias |E[Y|T=t]-E[Y|do(T=t)]| in terms of the degree of unmeasured confounding (i.e., the strength of the connection U->T and the strength of the connection U->Y). These bounds are tight when U is independent of T or when U is independent of Y given T (that is, when there is no unobserved confounding). We focus on a special case of this bound that depends on the total variation distance between the distributions p(U) and p(U|T=t), as well as the maximum (over all possible values of U) deviation of the conditional expected outcome E[Y|U=u,T=t] from the average expected outcome E[Y|T=t]. We discuss possible strategies for calibrating this bound to obtain interval estimates of treatment effects, and we experimentally validate the bound on synthetic and semi-synthetic datasets.
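To make the special case concrete, the following is a minimal numerical sketch and not the paper's implementation. It assumes a discrete confounder U, positivity (p(T=t|U=u) > 0 for every u), and that adjusting for U identifies the interventional mean, so that E[Y|do(T=t)] = sum_u p(u) E[Y|U=u,T=t]. Under these assumptions, applying Hölder's inequality with exponents (1, infinity) to sum_u (p(u|t) - p(u)) (E[Y|U=u,T=t] - E[Y|T=t]) gives |E[Y|T=t] - E[Y|do(T=t)]| <= 2 TV(p(U), p(U|T=t)) max_u |E[Y|U=u,T=t] - E[Y|T=t]|, where TV is taken as half the L1 distance. The factor 2 follows from that normalization and may differ from the constant used in the paper. The function name and its arguments are hypothetical, chosen for illustration only.

```python
import random


def confounding_bias_and_bound(p_u, p_t_given_u, ey_given_ut):
    """Return (bias, bound) for a discrete confounder U and one treatment level t.

    p_u[k]         : p(U = u_k)
    p_t_given_u[k] : p(T = t | U = u_k), assumed strictly positive (positivity)
    ey_given_ut[k] : E[Y | U = u_k, T = t]
    """
    # p(T = t) by the law of total probability, then p(U | T = t) by Bayes' rule.
    p_t = sum(pu * pt for pu, pt in zip(p_u, p_t_given_u))
    p_u_given_t = [pu * pt / p_t for pu, pt in zip(p_u, p_t_given_u)]

    # Observational mean E[Y | T = t] and interventional mean E[Y | do(T = t)]
    # (the latter via adjustment for U, which is an assumption of this sketch).
    e_obs = sum(q * m for q, m in zip(p_u_given_t, ey_given_ut))
    e_do = sum(p * m for p, m in zip(p_u, ey_given_ut))

    # Total variation distance, taken here as half the L1 distance.
    tv = 0.5 * sum(abs(q - p) for q, p in zip(p_u_given_t, p_u))
    # Largest deviation of E[Y | U = u, T = t] from E[Y | T = t].
    max_dev = max(abs(m - e_obs) for m in ey_given_ut)

    return abs(e_obs - e_do), 2.0 * tv * max_dev


# Check the inequality on randomly generated discrete models.
rng = random.Random(0)
for _ in range(1000):
    k = rng.randint(2, 6)
    weights = [rng.random() + 0.01 for _ in range(k)]
    p_u = [w / sum(weights) for w in weights]
    p_t_given_u = [rng.uniform(0.05, 0.95) for _ in range(k)]
    ey_given_ut = [rng.uniform(-5.0, 5.0) for _ in range(k)]
    bias, bound = confounding_bias_and_bound(p_u, p_t_given_u, ey_given_ut)
    assert bias <= bound + 1e-12
```

When p(T=t|U=u) does not vary with u, the total variation term is zero, and when E[Y|U=u,T=t] does not vary with u, the maximum deviation is zero. In both cases the bias and the bound vanish together, which matches the two tightness conditions stated above.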