The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds reveal that in order to obtain non-asymptotically optimal procedures, the error in estimating the treatment effect should be minimized in a certain weighted $L^2$-norm. We analyze a two-stage procedure based on constrained regression in this weighted norm, and establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds. These results show that the optimal non-asymptotic risk, in addition to depending on the asymptotically efficient variance, depends on the weighted norm distance between the true outcome function and its approximation by the richest function class supported by the sample size.
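To make the two-stage structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary treatment, a known propensity function, a hypothetical linear feature map, and the average treatment effect as the linear functional. The regression weights $(g/\pi)^2$ are an assumption of this sketch; the abstract specifies only "a certain weighted $L^2$-norm". All function names are illustrative.

```python
import numpy as np

def features(x, a):
    # Hypothetical feature map: intercept, x, a, and the x*a interaction.
    return np.column_stack([np.ones_like(x), x, a, x * a])

def fit_weighted_ls(x, a, y, w):
    # Weighted least squares: minimize sum_i w_i * (y_i - phi_i^T beta)^2.
    phi = features(x, a)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(phi * sw[:, None], y * sw, rcond=None)
    return beta

def two_stage_estimate(x, a, y, propensity):
    """Estimate tau = E[mu(X, 1) - mu(X, 0)] using sample splitting."""
    half = len(y) // 2
    p = propensity(x)
    # pi(X_i, A_i): probability of the action that was actually taken.
    pi = np.where(a == 1, p, 1.0 - p)
    # g(x, 1) = +1 and g(x, 0) = -1 encode the treatment-effect functional.
    g = 2.0 * a - 1.0

    # Stage 1: fit the outcome function on the first half, weighting squared
    # errors by (g / pi)^2 (an assumed choice of weighted norm).
    w = (g[:half] / pi[:half]) ** 2
    beta = fit_weighted_ls(x[:half], a[:half], y[:half], w)

    def mu(xs, acts):
        return features(xs, acts) @ beta

    # Stage 2: on the second half, combine the plug-in term with an
    # importance-weighted residual correction.
    x2, a2, y2 = x[half:], a[half:], y[half:]
    plug_in = mu(x2, np.ones_like(x2)) - mu(x2, np.zeros_like(x2))
    correction = g[half:] / pi[half:] * (y2 - mu(x2, a2))
    return float(np.mean(plug_in + correction))

# Synthetic data with a known answer: the true effect is E[2 + X] = 2.
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(-1.0, 1.0, n)
propensity = lambda xs: 0.3 + 0.4 * (xs > 0)  # assumed known behaviour policy
a = (rng.uniform(size=n) < propensity(x)).astype(float)
y = 1.0 + x + a * (2.0 + x) + rng.normal(size=n)

tau_hat = two_stage_estimate(x, a, y, propensity)
print(f"estimated effect: {tau_hat:.3f} (true value 2.0)")
```

In this sketch the first-stage regression and the second-stage average use disjoint halves of the sample, so the residual correction is evaluated on data that the fitted outcome function has not seen. With a misspecified feature map, the correction term still removes the first-order bias of the plug-in term, provided the propensity is correct.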