Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, and speech recognition. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, our understanding of how it works remains unclear, and laying down the theoretical foundation for deep learning is of central importance. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle behind its success is the manifold structure of data, namely that natural high-dimensional data concentrate close to a low-dimensional manifold, and that deep learning learns both the manifold and the probability distribution on it. We further introduce the concept of rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedded manifold, which describes the difficulty of learning it. We then show that for any deep neural network with a fixed architecture, there exists a manifold that the network cannot learn. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
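As background for the complexity notion mentioned above, a ReLU network realizes a piecewise linear map, and the number of linear pieces is one natural way to quantify its rectified linear complexity. The sketch below is a minimal illustration under that reading, not the paper's construction: it counts distinct ReLU activation patterns over a grid of sample inputs, which gives a lower bound on the number of linear pieces of a small randomly weighted network. The architecture, widths, and sampling grid are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (assumption): measure a ReLU network's "rectified linear
# complexity" by the number of linear pieces of the piecewise linear map it
# realizes.  Counting distinct activation patterns over sampled inputs gives a
# lower bound on that count.  Weights and sizes below are arbitrary examples.

rng = np.random.default_rng(0)

# A small ReLU network R^2 -> R with two hidden layers of width 8.
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def activation_pattern(x):
    """Return the on/off pattern of every ReLU unit for input x."""
    h1 = W1 @ x + b1
    a1 = np.maximum(h1, 0.0)
    h2 = W2 @ a1 + b2
    # Inputs sharing a pattern lie in the same linear piece of the network map.
    return tuple((h1 > 0).tolist() + (h2 > 0).tolist())

# Sample the square [-1, 1]^2 on a grid and count distinct patterns.
grid = np.linspace(-1.0, 1.0, 200)
patterns = {activation_pattern(np.array([u, v])) for u in grid for v in grid}
print("lower bound on number of linear pieces:", len(patterns))
```

In this reading, a manifold whose encoding map requires more linear pieces than any map the fixed architecture can produce cannot be learned by that network, which is the kind of comparison the abstract alludes to.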