We study optimal procedures for estimating a linear functional based on observational data. In many problems of this kind, a widely used assumption is strict overlap, i.e., uniform boundedness of the importance ratio, which measures how well the observational data covers the directions of interest. When it is violated, the classical semi-parametric efficiency bound can easily become infinite, so that the instance-optimal risk depends on the function class used to model the regression function. For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error in estimating a broad class of linear functionals. This lower bound refines the classical semi-parametric one, and makes connections to moduli of continuity in functional estimation. When $\mathcal{F}$ is a reproducing kernel Hilbert space, we prove that this lower bound can be achieved up to a constant factor by analyzing a computationally simple regression estimator. We apply our general results to various families of examples, thereby uncovering a spectrum of rates that interpolate between the classical theories of semi-parametric efficiency (with $\sqrt{n}$-consistency) and the slower minimax rates associated with non-parametric function estimation.
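To fix ideas, one canonical instance of this setup (the notation below is an illustrative sketch, not drawn verbatim from the abstract): data $(X_i, Y_i)_{i=1}^n$ are observed with covariates $X_i \sim \mathbb{Q}$ and regression function $f^\star(x) = \mathbb{E}[Y \mid X = x]$ lying in the class $\mathcal{F}$, and the target is the mean of $f^\star$ under a shifted covariate law $\mathbb{P}$:
\[
\tau(f^\star) \;=\; \mathbb{E}_{X \sim \mathbb{P}}\big[f^\star(X)\big],
\qquad
\rho(x) \;=\; \frac{d\mathbb{P}}{d\mathbb{Q}}(x),
\]
where $\rho$ is the importance ratio. Strict overlap is the condition $\sup_x \rho(x) \le b < \infty$; when it fails, the asymptotic variance appearing in the classical semi-parametric efficiency bound, which involves second moments of $\rho$, can be infinite, and the attainable risk then depends on $\mathcal{F}$.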