We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding which features are learned, why deep architectures perform exceptionally well in physical problems, and in which ways fine aspects of an architecture affect the behavior of a learning task. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.