We develop a new formulation of deep learning based on the Mori-Zwanzig (MZ) formalism of irreversible statistical mechanics. The new formulation is built upon the well-known duality between deep neural networks and discrete dynamical systems, and it allows us to directly propagate quantities of interest (conditional expectations and probability density functions) forward and backward through the network by means of exact linear operator equations. These equations can be used as a starting point to develop new effective parameterizations of deep neural networks, and they provide a new framework to study deep learning via operator-theoretic methods. The proposed MZ formulation of deep learning naturally introduces a new concept, i.e., the memory of the neural network, which plays a fundamental role in low-dimensional modeling and parameterization. By using the theory of contraction mappings, we develop sufficient conditions for the memory of the neural network to decay with the number of layers. This allows us to rigorously transform deep networks into shallow ones, e.g., by reducing the number of neurons per layer (using projection operators), or by reducing the total number of layers (using the decay property of the memory operator).
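For concreteness, the following is a minimal sketch of the kind of operator identity underlying such a formulation; the notation is illustrative (time-homogeneous layers and a single projection, rather than the layer-dependent operators the full formulation would require). Viewing the layers as a discrete dynamical system $x_{k+1} = F(x_k)$ with composition (Koopman-type) operator $(\mathcal{L} g)(x) = g(F(x))$, and choosing a projection $\mathcal{P}$ with complement $\mathcal{Q} = \mathcal{I} - \mathcal{P}$, the discrete Dyson identity yields the exact decomposition
\[
\mathcal{L}^{\,n} \;=\; (\mathcal{Q}\mathcal{L})^{n} \;+\; \sum_{j=0}^{n-1} \mathcal{L}^{\,j}\,\mathcal{P}\mathcal{L}\,(\mathcal{Q}\mathcal{L})^{\,n-1-j},
\]
in which the sum plays the role of the memory term and $(\mathcal{Q}\mathcal{L})^{n}$ that of the orthogonal (unresolved) dynamics. If $\|\mathcal{Q}\mathcal{L}\| \le \rho < 1$ in a suitable operator norm, e.g., when each layer map is a contraction, then $\|(\mathcal{Q}\mathcal{L})^{n}\| \le \rho^{\,n}$, so the memory contribution decays geometrically with depth; this is the mechanism that makes it possible to truncate the memory and replace a deep network with a shallower one.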