We prove that natural gradient descent, with respect to the parameters of a machine learning policy, admits a conjugate dynamical description consistent with evolution by natural selection. We characterize these conjugate dynamics as a locally optimal fit to the continuous-time replicator dynamics, and show that the Price equation applies to equivalence classes of functions belonging to a Hilbert space generated by the policy's architecture and parameters. We posit that "conjugate natural selection" intuitively explains the empirical effectiveness of natural gradient descent, while developing a useful analytic approach to the dynamics of machine learning.
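As a point of reference for the objects named above, the following are the standard forms usually meant by natural gradient descent, the continuous-time replicator dynamics, and the Price equation; the symbols used here ($\theta$, $F$, $L$, $p$, $f$, $g$) are illustrative choices for this sketch and are not notation taken from the abstract itself. Natural gradient descent on the parameters $\theta$ of a policy $\pi_\theta$, written as a continuous-time flow with loss $L$ and Fisher information matrix $F(\theta)$, is
\[
\dot{\theta} = -F(\theta)^{-1}\,\nabla_\theta L(\theta),
\qquad
F(\theta) = \mathbb{E}_{x \sim \pi_\theta}\!\left[\nabla_\theta \log \pi_\theta(x)\,\nabla_\theta \log \pi_\theta(x)^{\top}\right].
\]
The continuous-time replicator dynamics evolve a density $p$ over types $x$ with fitness $f(x)$ according to
\[
\dot{p}(x) = p(x)\bigl(f(x) - \bar{f}\bigr),
\qquad
\bar{f} = \int f(x)\,p(x)\,dx,
\]
and the continuous-time Price equation tracks the mean of a trait $g$ under such dynamics,
\[
\frac{d}{dt}\,\mathbb{E}_p[g] = \mathrm{Cov}_p(g, f) + \mathbb{E}_p[\dot{g}].
\]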