Given the joint distribution of two random variables $X,Y$ on some second countable locally compact Hausdorff space, we investigate the statistical approximation of the $L^2$-operator defined by $[Pf](x) := \mathbb{E}[ f(Y) \mid X = x ]$ under minimal assumptions. By modifying its domain, we prove that $P$ can be approximated arbitrarily well in operator norm by Hilbert--Schmidt operators acting on a reproducing kernel Hilbert space. This fact allows $P$ to be estimated uniformly by finite-rank operators over a dense subspace even when $P$ is not compact. In terms of modes of convergence, we thereby establish the superiority of kernel-based techniques over classically used parametric projection approaches such as Galerkin methods. This also provides a novel perspective on the limiting object to which the nonparametric estimate of $P$ converges. As an application, we show that these results are particularly important for a large family of spectral analysis techniques for Markov transition operators. Our investigation also gives a new asymptotic perspective on the so-called kernel conditional mean embedding, which is the theoretical foundation of a wide variety of techniques in kernel-based nonparametric inference.
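As a rough illustration of the kind of kernel-based estimate of $[Pf](x) = \mathbb{E}[ f(Y) \mid X = x ]$ referred to above, the following sketch uses the commonly cited regularized empirical form of the kernel conditional mean embedding, $[\hat{P}f](x) = k_X(x)^\top (K + n\lambda I)^{-1} f(Y)$. The Gaussian kernel, the bandwidth, the regularization parameter $\lambda$, the synthetic data, and the helper names `gaussian_kernel` and `estimate_Pf` are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=0.5):
    """Gram matrix k(a_i, b_j) = exp(-(a_i - b_j)^2 / (2 * bandwidth^2)) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def estimate_Pf(x_train, y_train, f, x_query, reg=1e-3, bandwidth=0.5):
    """Hypothetical helper: regularized empirical estimate of E[f(Y) | X = x].

    Computes k_X(x)^T (K + n * reg * I)^{-1} f(Y), where K is the Gram
    matrix of the X samples. The kernel and parameters are illustrative choices.
    """
    n = len(x_train)
    K = gaussian_kernel(x_train, x_train, bandwidth)
    # Solve the regularized linear system instead of forming an explicit inverse.
    alpha = np.linalg.solve(K + n * reg * np.eye(n), f(y_train))
    return gaussian_kernel(x_query, x_train, bandwidth) @ alpha

# Synthetic example (assumed): Y = sin(X) + noise, so E[Y | X = x] = sin(x).
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=400)
y = np.sin(x) + 0.1 * rng.standard_normal(400)

x_query = np.linspace(-1.5, 1.5, 7)
estimate = estimate_Pf(x, y, lambda t: t, x_query)  # f is the identity here
max_error = np.max(np.abs(estimate - np.sin(x_query)))
print("max abs error against sin(x):", max_error)
```

Here the estimate is a finite-rank object built from $n$ samples; the choice of $f$ as the identity simply makes the target $\mathbb{E}[Y \mid X = x]$ easy to compare against a known function.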