In the regression framework, the empirical measure based on the responses resulting from the nearest neighbors, among the covariates, to a given point $x$ is introduced and studied as a central statistical quantity. First, the associated empirical process is shown to satisfy a uniform central limit theorem under a local bracketing entropy condition on the underlying class of functions reflecting the localizing nature of the nearest neighbor algorithm. Second a uniform non-asymptotic bound is established under a well-known condition, often referred to as Vapnik-Chervonenkis, on the uniform entropy numbers. The covariance of the Gaussian limit obtained in the uniform central limit theorem is simply equal to the conditional covariance operator given the covariate value. This suggests the possibility of using standard formulas to estimate the variance by using only the nearest neighbors instead of the full data. This is illustrated on two problems: the estimation of the conditional cumulative distribution function and local linear regression.
翻译:暂无翻译