The empirical measure formed by the nearest neighbors of a given point, the \textit{nearest neighbor measure}, is introduced and studied as a central statistical quantity. First, the associated empirical process is shown to satisfy a uniform central limit theorem under a (local) bracketing entropy condition on the underlying class of functions, reflecting the localizing nature of the nearest neighbor algorithm. Second, a uniform non-asymptotic bound is established under a well-known condition on the uniform entropy numbers, often referred to as the Vapnik-Chervonenkis condition. The covariance of the Gaussian limit obtained in the uniform central limit theorem equals the conditional covariance operator given the point of interest. This suggests that standard, non-local approaches can be extended by simply replacing the standard empirical measure with the nearest neighbor measure, making inference in the same way but with the nearest neighbors only instead of the full data.
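To make the idea concrete, here is a minimal sketch (not from the paper; all names and the regression setup are illustrative assumptions) of the nearest neighbor measure as the uniform empirical measure on the $k$ nearest neighbors of a query point: integrating a function against it amounts to averaging that function over the neighbors only, which localizes the usual empirical-measure estimate around the point of interest.

```python
import numpy as np

def nn_indices(X, x, k):
    # Indices of the k nearest neighbors of x among the rows of X (Euclidean).
    d = np.linalg.norm(X - x, axis=1)
    return np.argsort(d)[:k]

def nn_measure_integral(X, Y, x, k, f):
    # Integral of f against the nearest neighbor measure at x:
    # (1/k) * sum of f(Y_i) over the k nearest neighbors of x.
    # (Hypothetical helper; the paper's formal definition may differ.)
    idx = nn_indices(X, x, k)
    return float(np.mean([f(y) for y in Y[idx]]))

# Illustrative regression setup: E[Y | X = x] = x[0].
rng = np.random.default_rng(0)
n, k = 5000, 100
X = rng.uniform(-1.0, 1.0, size=(n, 2))
Y = X[:, 0] + 0.1 * rng.standard_normal(n)

x0 = np.array([0.3, -0.2])
est = nn_measure_integral(X, Y, x0, k, lambda y: y)
```

Here `est` is a localized estimate of the conditional mean at `x0`, whereas the same formula applied to the full sample would only recover the unconditional mean; this is the sense in which standard empirical-measure inference carries over with the nearest neighbors substituted for the full data.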