In this paper, we study the generalization performance of global minima of empirical risk minimization (ERM) on over-parameterized deep ReLU nets. Using a novel deepening scheme for deep ReLU nets, we rigorously prove that there exist perfect global minima achieving almost optimal generalization error bounds for numerous types of data under mild conditions. Since over-parameterization is crucial to guarantee that the global minima of ERM on deep ReLU nets can be realized by the widely used stochastic gradient descent (SGD) algorithm, our results fill the gap between optimization and generalization.
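To make the setting concrete, below is a minimal sketch of ERM on an over-parameterized deep ReLU net trained by plain SGD. The architecture, data, and hyperparameters are illustrative placeholders, not the construction analyzed in the paper; the point is only that with far more parameters than samples, SGD drives the empirical risk toward a (near-)zero global minimum.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Illustrative toy data: n samples in d dimensions with scalar targets.
n, d = 64, 8
X = torch.randn(n, d)
y = torch.randn(n, 1)

# An over-parameterized deep ReLU net: far more parameters than samples,
# so ERM admits global minima with (near-)zero empirical risk.
width, depth = 256, 4
layers = [nn.Linear(d, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers.append(nn.Linear(width, 1))
net = nn.Sequential(*layers)

# Empirical risk minimization with plain SGD (full-batch for simplicity).
opt = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.MSELoss()
for step in range(2000):
    opt.zero_grad()
    loss = loss_fn(net(X), y)
    loss.backward()
    opt.step()

print(f"final empirical risk: {loss.item():.2e}")
```

Whether such an interpolating minimum also generalizes well is exactly the question the paper addresses.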