How a neural network behaves during training under different choices of hyperparameters is an important question in the study of neural networks. In this work, inspired by phase diagrams in statistical mechanics, we draw the phase diagram of the two-layer ReLU neural network in the infinite-width limit to completely characterize its dynamical regimes and their dependence on initialization-related hyperparameters. Through both experimental and theoretical approaches, we identify three regimes in the phase diagram, i.e., the linear, critical, and condensed regimes, based on the relative change of the input weights as the width approaches infinity, which tends to $0$, $O(1)$, and $+\infty$, respectively. In the linear regime, the training dynamics of the NN are approximately linear, resembling a random feature model, with exponential loss decay. In the condensed regime, we demonstrate through experiments that active neurons condense at several discrete orientations. The critical regime serves as the boundary between the above two regimes and exhibits intermediate nonlinear behavior, with the mean-field model as a typical example. Overall, our phase diagram for the two-layer ReLU NN serves as a map for future studies and is a first step toward a more systematic investigation of the training behavior and implicit regularization of NNs with different structures.