Overparametrization often helps improve generalization performance. This paper proposes a dual view of overparametrization suggesting that downsampling may also help generalize. Motivated by this dual view, we characterize two out-of-sample prediction risks of the sketched ridgeless least squares estimator in the proportional regime $m\asymp n \asymp p$, where $m$ is the sketching size, $n$ the sample size, and $p$ the feature dimensionality. Our results reveal the statistical role of downsampling. Specifically, downsampling does not always hurt generalization performance, and may actually help improve it in some cases. We identify the optimal sketching sizes that minimize the out-of-sample prediction risks, and find that the optimally sketched estimator has stabler risk curves that eliminate the peaks of those for the full-sample estimator. We then propose a practical procedure to empirically identify the optimal sketching size. Finally, we extend our results to cover central limit theorems and misspecified models. Numerical studies strongly support our theory.
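The following is a minimal sketch (not the authors' code) of the estimator the abstract describes: the data are compressed with a random sketching matrix of $m$ rows and fit with the minimum-norm (ridgeless) least squares solution, whose out-of-sample prediction risk is then estimated by Monte Carlo. The Gaussian data model, Gaussian sketching matrix, and all variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 500, 300                            # sample size and feature dimensionality
beta = rng.normal(size=p) / np.sqrt(p)     # assumed true coefficients

# Training data from an assumed linear model y = X beta + noise
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(scale=0.5, size=n)

def sketched_ridgeless(X, y, m, rng):
    """Minimum-norm least squares fit on data compressed to m sketched rows."""
    S = rng.normal(size=(m, X.shape[0])) / np.sqrt(m)  # Gaussian sketching matrix
    return np.linalg.pinv(S @ X) @ (S @ y)            # ridgeless (min-norm) solution

def out_of_sample_risk(b_hat, beta, n_test=2000):
    """Monte Carlo estimate of the prediction risk E[(x^T b_hat - x^T beta)^2]."""
    X_test = rng.normal(size=(n_test, p))
    return np.mean((X_test @ (b_hat - beta)) ** 2)

# Compare the full-sample estimator (m = n) with downsampled (sketched) ones,
# keeping m, n, p of the same order as in the proportional regime.
for m in (n, 400, 350, 320):
    b_hat = sketched_ridgeless(X, y, m, rng)
    print(f"sketch size m={m:4d}  risk={out_of_sample_risk(b_hat, beta):.4f}")
```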