We show analytically that training a neural network by stochastic mutation, or "neuroevolution", of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks. Our results connect two distinct types of neural-network training and help justify the empirical success of neuroevolution.
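To make the compared procedures concrete, the following is a minimal sketch on a toy least-squares problem: one routine trains by stochastic weight mutation with a greedy acceptance rule (keep a Gaussian mutation only if the loss does not increase), the other by gradient descent with additive Gaussian noise. The acceptance rule, toy model, and all hyperparameters (sigma, eta, n_steps) are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: fit y = X @ w_true with a linear "network" (weight vector w).
X = rng.normal(size=(100, 5))
w_true = rng.normal(size=5)
y = X @ w_true

def loss(w):
    return np.mean((X @ w - y) ** 2)

def grad(w):
    return 2 * X.T @ (X @ w - y) / len(y)

def neuroevolution(w, sigma=1e-2, n_steps=5000):
    """Stochastic mutation: perturb all weights by Gaussian noise of scale
    sigma and keep the mutation only if the loss does not increase."""
    w = w.copy()
    for _ in range(n_steps):
        trial = w + sigma * rng.normal(size=w.shape)
        if loss(trial) <= loss(w):
            w = trial
    return w

def noisy_gradient_descent(w, eta=1e-3, noise=1e-3, n_steps=5000):
    """Gradient descent on the loss with additive Gaussian white noise."""
    w = w.copy()
    for _ in range(n_steps):
        w = w - eta * grad(w) + noise * rng.normal(size=w.shape)
    return w

w0 = rng.normal(size=5)
print("neuroevolution loss:", loss(neuroevolution(w0)))
print("noisy GD loss:      ", loss(noisy_gradient_descent(w0)))
```

With small mutation scales both routines drive the loss toward zero along similar trajectories on this toy problem, which is the qualitative correspondence the abstract describes; the analytical statement concerns the small-mutation limit averaged over independent realizations.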