This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature. We also discuss approaches to provide non-vacuous generalization guarantees for deep learning. Based on theoretical observations, we propose new open problems and discuss the limitations of our results.