We propose an evolution strategies-based algorithm for estimating gradients in unrolled computation graphs, called ES-Single. Like the recently proposed Persistent Evolution Strategies (PES), ES-Single is unbiased, and overcomes chaos arising from recursive function applications by smoothing the meta-loss landscape. ES-Single samples a single perturbation per particle, which is kept fixed over the course of an inner problem (i.e., perturbations are not re-sampled for each partial unroll). Compared to PES, ES-Single is simpler to implement and has lower variance: the variance of ES-Single is constant with respect to the number of truncated unrolls, removing a key barrier to applying ES to long inner problems using short truncations. We show that ES-Single is unbiased for quadratic inner problems, and demonstrate empirically that its variance can be substantially lower than that of PES. ES-Single consistently outperforms PES on a variety of tasks, including a synthetic benchmark task, hyperparameter optimization, training recurrent neural networks, and training learned optimizers.
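The core mechanism described above can be illustrated with a minimal sketch. This is a hedged, illustrative implementation, not the authors' code: the toy quadratic inner problem, the particle count, the smoothing scale `sigma`, and the antithetic-sampling variant are all assumptions made for the example. The key point it demonstrates is that each particle's perturbation `eps` is drawn once per inner problem and held fixed across all partial unrolls, rather than being re-sampled at each truncation as in PES.

```python
import numpy as np

def inner_unroll(theta, state, n_steps):
    """Toy quadratic inner problem (an assumption for illustration):
    the state contracts and is pushed by theta; per-step loss is quadratic."""
    total_loss = 0.0
    for _ in range(n_steps):
        state = 0.9 * state + theta      # one step of the inner dynamics
        total_loss += np.sum(state ** 2)  # quadratic per-step loss
    return state, total_loss

def es_single_grad(theta, n_particles=512, sigma=0.1,
                   total_steps=20, truncation=5, seed=0):
    """Sketch of an ES-Single-style antithetic estimator: one perturbation
    per particle, held fixed over every truncated unroll of the inner problem."""
    rng = np.random.default_rng(seed)
    d = theta.shape[0]
    grad = np.zeros(d)
    for _ in range(n_particles):
        eps = sigma * rng.standard_normal(d)  # sampled ONCE per particle
        s_pos = np.zeros(d)
        s_neg = np.zeros(d)
        loss_pos = loss_neg = 0.0
        # Partial unrolls: state is carried forward, eps is NOT re-sampled.
        for _ in range(total_steps // truncation):
            s_pos, lp = inner_unroll(theta + eps, s_pos, truncation)
            s_neg, ln = inner_unroll(theta - eps, s_neg, truncation)
            loss_pos += lp
            loss_neg += ln
        # Antithetic ES gradient contribution from this particle.
        grad += (loss_pos - loss_neg) * eps / (2 * sigma ** 2)
    return grad / n_particles
```

Because the inner loss here is quadratic in `theta`, this estimator's mean matches the true gradient of the full unrolled loss, consistent with the unbiasedness claim for quadratic inner problems; holding `eps` fixed is what keeps the variance independent of the number of truncated unrolls.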