Estimation of a conditional mean (linking a set of features to an outcome of interest) is a fundamental statistical task. While flexible nonparametric procedures are appealing, effective estimation in many classical nonparametric function spaces (e.g., multivariate Sobolev spaces) can be prohibitively difficult -- both statistically and computationally -- especially when the number of features is large. In this paper, we present (penalized) sieve estimators for regression in nonparametric tensor product spaces: These spaces are more amenable to multivariate regression and allow us to, in part, avoid the curse of dimensionality. Our estimators can be easily applied to multivariate nonparametric problems and have appealing statistical and computational properties. Moreover, they can effectively leverage additional structure such as feature sparsity. In this manuscript, we give theoretical guarantees indicating that the predictive performance of our estimators scales favorably with dimension. In addition, we present numerical examples comparing the finite-sample performance of the proposed estimators with several popular machine learning methods.
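To make the idea of a penalized sieve estimator over a tensor product space concrete, the following is a minimal, hypothetical sketch (not the paper's actual estimator): it builds a tensor-product cosine basis on [0, 1]^d and fits the expansion coefficients by ridge-penalized least squares. The specific basis, penalty, and basis size K are assumptions made purely for illustration.

```python
# Illustrative sketch only -- a penalized sieve regression using a
# tensor-product cosine basis on [0, 1]^d. The cosine basis, ridge
# penalty, and truncation level K are illustrative assumptions, not
# the estimator proposed in the paper.
import numpy as np

def tensor_cosine_basis(X, K):
    """Evaluate products of univariate cosine basis functions
    phi_0(x) = 1, phi_k(x) = sqrt(2) * cos(pi * k * x), k = 1..K-1,
    across the d coordinates of X (n x d), giving an n x K^d design."""
    n, d = X.shape
    uni = np.ones((d, n, K))                      # univariate evaluations
    for k in range(1, K):
        uni[:, :, k] = np.sqrt(2) * np.cos(np.pi * k * X.T)
    design = uni[0]
    for j in range(1, d):                          # row-wise Kronecker products
        design = np.einsum("na,nb->nab", design, uni[j]).reshape(n, -1)
    return design

def fit_penalized_sieve(X, y, K=5, lam=1e-1):
    """Ridge-penalized least squares on the tensor-product sieve."""
    Phi = tensor_cosine_basis(X, K)
    coef = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
    return coef

def predict(coef, X, K=5):
    return tensor_cosine_basis(X, K) @ coef

# Toy usage: d = 2 features, a smooth regression function, noisy responses.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.sin(2 * np.pi * X[:, 0]) * X[:, 1] + 0.1 * rng.normal(size=200)
coef = fit_penalized_sieve(X, y)
print(predict(coef, X[:5]))
```

In practice the truncation level K and the penalty would be chosen to grow or shrink with the sample size, and additional structure (e.g., feature sparsity, as mentioned above) would be imposed on the coefficients; this sketch only illustrates the tensor-product construction itself.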