In the past decade the mathematical theory of machine learning has lagged far behind the triumphs of deep neural networks on practical challenges. However, the gap between theory and practice is gradually starting to close. In this paper I will attempt to assemble some pieces of the remarkable and still incomplete mathematical mosaic emerging from the efforts to understand the foundations of deep learning. The two key themes will be interpolation and its sibling, over-parameterization. Interpolation corresponds to fitting data, even noisy data, exactly. Over-parameterization enables interpolation and provides the flexibility to select the right interpolating model. As we will see, just as a physical prism separates colors mixed within a ray of light, the figurative prism of interpolation helps to disentangle generalization and optimization properties within the complex picture of modern machine learning. This article is written with the belief and hope that a clearer understanding of these issues brings us a step closer toward a general theory of deep learning and machine learning.