We investigate the analogy between the renormalization group (RG) and deep neural networks, wherein subsequent layers of neurons are analogous to successive steps along the RG. In particular, we quantify the flow of information by explicitly computing the relative entropy or Kullback-Leibler divergence in both the one- and two-dimensional Ising models under decimation RG, as well as in a feedforward neural network as a function of depth. We observe qualitatively identical behavior characterized by a monotonic increase to a parameter-dependent asymptotic value. On the quantum field theory side, the monotonic increase confirms the connection between the relative entropy and the c-theorem. For the neural networks, the asymptotic behavior may have implications for various information-maximization methods in machine learning, as well as for disentangling compactness and generalizability. Furthermore, while both the two-dimensional Ising model and the random neural networks we consider exhibit non-trivial critical points, the relative entropy appears insensitive to the phase structure of either system. In this sense, more refined probes are required to fully elucidate the flow of information in these models.
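To make the Ising-side computation concrete, the following is a minimal sketch, not the authors' actual setup, of the kind of quantity described above: the Kullback-Leibler divergence between the Boltzmann distribution of a short open one-dimensional Ising chain at its initial coupling and the distribution obtained by flowing that coupling through the standard decimation recursion tanh K' = tanh^2 K. The chain length N, the initial coupling K0, the open boundary conditions, and the choice of the undecimated distribution as the reference are illustrative assumptions.

```python
import numpy as np
from itertools import product

def boltzmann(N, K):
    # Exact Boltzmann distribution of an open 1D Ising chain of N spins
    # with nearest-neighbour coupling K (in units of 1/kT).
    configs = np.array(list(product([-1, 1], repeat=N)))
    bond_sum = (configs[:, :-1] * configs[:, 1:]).sum(axis=1)  # sum_i s_i s_{i+1}
    weights = np.exp(K * bond_sum)
    return weights / weights.sum()

def flow_coupling(K, steps):
    # Standard 1D decimation recursion: tanh(K') = tanh(K)**2 per RG step.
    for _ in range(steps):
        K = np.arctanh(np.tanh(K) ** 2)
    return K

def kl_divergence(p, q):
    return float(np.sum(p * np.log(p / q)))

if __name__ == "__main__":
    N, K0 = 10, 1.0        # illustrative chain length and initial coupling
    p0 = boltzmann(N, K0)  # reference (undecimated) distribution
    for n in range(7):
        Kn = flow_coupling(K0, n)
        D = kl_divergence(p0, boltzmann(N, Kn))
        print(f"RG step {n}: K_n = {Kn:.4f}, D(p_K0 || p_Kn) = {D:.4f}")
```

In this toy setting the flowed coupling K_n decreases toward the trivial fixed point K* = 0, so the divergence grows monotonically toward its K0-dependent asymptote D(p_K0 || uniform), consistent with the behavior described in the abstract; the neural-network analogue (relative entropy as a function of depth) depends on the specific network ensemble and is not reproduced here.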