We develop a new formulation of deep learning based on the Mori-Zwanzig (MZ) formalism of irreversible statistical mechanics. The new formulation is built upon the well-known duality between deep neural networks and discrete stochastic dynamical systems, and it allows us to directly propagate quantities of interest (conditional expectations and probability density functions) forward and backward through the network by means of exact linear operator equations. These equations can be used as a starting point to develop new effective parameterizations of deep neural networks, and they provide a new framework to study deep learning via operator-theoretic methods. The proposed MZ formulation of deep learning naturally introduces a new concept, i.e., the memory of the neural network, which plays a fundamental role in low-dimensional modeling and parameterization. By using the theory of contraction mappings, we develop sufficient conditions for the memory of the neural network to decay with the number of layers. This allows us to rigorously transform deep networks into shallow ones, e.g., by reducing the number of neurons per layer (using projection operators), or by reducing the total number of layers (using the decay property of the memory operator).
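To make the role of the memory term concrete, the following is a minimal sketch of the kind of operator identity involved, under the simplifying assumption of a single, layer-independent transfer operator \(\mathcal{L}\) (the layer operators of a real network generally differ from layer to layer). With a projection \(\mathcal{P}\) onto the resolved degrees of freedom (e.g., a subset of neurons) and \(\mathcal{Q} = \mathcal{I} - \mathcal{P}\), the discrete Dyson identity

\[
\mathcal{L}^{n} \;=\; (\mathcal{Q}\mathcal{L})^{n} \;+\; \sum_{m=0}^{n-1} \mathcal{L}^{\,n-m-1}\,\mathcal{P}\mathcal{L}\,(\mathcal{Q}\mathcal{L})^{m}
\]

splits the \(n\)-layer evolution into projected contributions (the sum) and an unresolved term \((\mathcal{Q}\mathcal{L})^{n}\) associated with the memory. If \(\mathcal{Q}\mathcal{L}\) is a contraction, then \(\|(\mathcal{Q}\mathcal{L})^{n}\| \le \|\mathcal{Q}\mathcal{L}\|^{n} \to 0\), which illustrates the layer-wise memory decay that the contraction-mapping conditions are meant to guarantee.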
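As a complementary illustration, here is a minimal sketch (not the paper's implementation) of the duality between a deep network and a discrete stochastic dynamical system \(X_{n+1} = F_n(X_n, \xi_n)\), with the conditional expectation of an observable propagated forward through the layers by Monte Carlo sampling. The layer widths, tanh activations, and additive Gaussian noise are illustrative assumptions.

```python
# A sketch of a deep network read as a discrete stochastic dynamical
# system, and of the forward propagation of a conditional expectation
# E[q(X_N) | X_0 = x] by sampling. All architectural choices here
# (widths, tanh, Gaussian layer noise) are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def make_layer(d_in, d_out, noise=0.05):
    """One stochastic layer: x -> tanh(W x + b) + Gaussian perturbation."""
    W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
    b = np.zeros(d_out)
    def layer(x):
        return np.tanh(W @ x + b) + noise * rng.standard_normal(d_out)
    return layer

layers = [make_layer(4, 4) for _ in range(6)]   # a six-layer network

def conditional_expectation(q, x0, n_samples=2000):
    """Estimate E[q(X_N) | X_0 = x0] by pushing samples through all layers."""
    total = 0.0
    for _ in range(n_samples):
        x = x0
        for layer in layers:          # forward propagation of the state
            x = layer(x)
        total += q(x)
    return total / n_samples

q = lambda x: x[0]                    # observable: first output coordinate
print(conditional_expectation(q, np.ones(4)))
```

In this reading, the exact linear operator equations of the abstract propagate such conditional expectations through the network without sampling; the Monte Carlo estimator above merely illustrates the quantity being propagated.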