This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron (Kakade et al., 2011) and provide dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we also provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression.
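For orientation, the key algorithmic distinction can be sketched as follows (standard forms of the update rules, stated here for comparison; the paper's exact step sizes and sampling scheme may differ). GLM-tron performs Perceptron-style updates that omit the derivative of the activation, whereas SGD on the squared loss retains it:

\[
w_{t+1} = w_t + \eta \,\big(y_t - \mathrm{ReLU}(\langle w_t, x_t\rangle)\big)\, x_t \qquad \text{(GLM-tron)}
\]
\[
w_{t+1} = w_t + \eta \,\big(y_t - \mathrm{ReLU}(\langle w_t, x_t\rangle)\big)\, \mathbb{1}\{\langle w_t, x_t\rangle > 0\}\, x_t \qquad \text{(SGD on the squared loss)}
\]

The indicator factor in the SGD update zeroes out the correction whenever the current prediction lies in the inactive region of the ReLU, which is the mechanism behind the negative results for SGD discussed above.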