When the regression function belongs to a smooth class of univariate functions whose derivatives up to the $(\gamma+1)$th order are bounded in absolute value by a common constant, everywhere or a.e., it is commonly believed that exploiting higher-degree smoothness assumptions helps reduce the estimation error. This paper shows that the minimax optimal mean integrated squared error (MISE) rate increases in $\gamma$ when the sample size $n$ is small relative to $\left(\gamma+1\right)^{2\gamma+3}$ (e.g., $\left(\gamma+1\right)^{2\gamma+3}=262144$ when $\gamma=3$), and decreases in $\gamma$ when $n$ is large relative to $\left(\gamma+1\right)^{2\gamma+3}$. In particular, this phase transition property is shown to be achieved by common nonparametric procedures. Consider $\gamma_{1}$ and $\gamma_{2}$ with $\gamma_{1}<\gamma_{2}$, so that the $(\gamma_{2}+1)$th degree smoothness class is a subset of the $(\gamma_{1}+1)$th degree class. What is interesting about our results is their implication that, if $n$ is small relative to $\left(\gamma_{1}+1\right)^{2\gamma_{1}+3}$, the optimal rate achieved by the estimator constrained to lie in the smoother class is larger. For data sets with fewer than hundreds of thousands of observations, our results suggest that one should not exploit smoothness beyond the third degree. To some extent, our results provide a theoretical basis for the widely adopted practical recommendation given by Gelman and Imbens (2019). The building blocks of our minimax optimality results are a set of metric entropy bounds that we develop in this paper for smooth function classes. Some of these bounds are original, and others refine and/or generalize existing bounds in the literature.
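
To give a sense of how quickly the threshold $\left(\gamma+1\right)^{2\gamma+3}$ grows, the following minimal Python sketch tabulates it for a few values of $\gamma$. The function name `smoothness_threshold` and the range of $\gamma$ are illustrative choices, not taken from the paper, and the abstract does not quantify "small relative to", so the numbers indicate scale only and are not exact cutoffs.

```python
def smoothness_threshold(gamma: int) -> int:
    """Return (gamma + 1)^(2*gamma + 3), the quantity the sample size n is compared against.

    Hypothetical helper for illustration; the abstract gives only the formula.
    """
    return (gamma + 1) ** (2 * gamma + 3)


# Tabulate the threshold for gamma = 0, ..., 5 (an arbitrary illustrative range).
thresholds = {gamma: smoothness_threshold(gamma) for gamma in range(6)}

for gamma, value in thresholds.items():
    print(f"gamma = {gamma}: (gamma + 1)^(2*gamma + 3) = {value:,}")
```

For $\gamma=3$ this reproduces the value $262144$ quoted above, and moving to $\gamma=4$ already raises the threshold to $5^{11}=48828125$.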