We analyze Elman-type Recurrent Neural Networks (RNNs) and their training in the mean-field regime. Specifically, we show convergence of the gradient descent training dynamics of the RNN to the corresponding mean-field formulation in the large-width limit. We also show that the fixed points of the limiting infinite-width dynamics are globally optimal, under some assumptions on the initialization of the weights. Our results establish optimality for feature learning with wide RNNs in the mean-field regime.