Neuroevolution has recently been shown to be competitive in reinforcement learning (RL) settings and can alleviate some of the drawbacks of gradient-based approaches. This paper applies neuroevolution with a simple genetic algorithm (GA) to find neural network weights that produce optimally behaving agents. In addition, we present two novel modifications that improve data efficiency and speed of convergence compared with the initial implementation. The modifications are evaluated on the FrozenLake environment provided by OpenAI Gym and perform significantly better than the baseline approach.
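To make the baseline idea concrete, the following is a minimal sketch of a GA searching for the weights of a tiny policy network on a FrozenLake-style grid. It is not the paper's implementation, and it does not cover the two modifications. Everything specific here is an assumption chosen for illustration: the hand-written deterministic (non-slippery) 4x4 grid used in place of the Gym environment, the single linear layer over one-hot states, truncation selection with elitism, single-point crossover, Gaussian mutation, and all names and hyperparameters (`fitness`, `evolve`, `sigma`, `n_elite`).

```python
import random

# Assumed 4x4 layout (S=start, F=frozen, H=hole, G=goal), deterministic moves.
LAKE = ["SFFF", "FHFH", "FFFH", "HFFG"]
SIZE = 4
N_STATES, N_ACTIONS = SIZE * SIZE, 4
# Actions 0..3 = left, down, right, up, as (d_row, d_col).
MOVES = [(0, -1), (1, 0), (0, 1), (-1, 0)]


def act(genome, state):
    """Greedy action of a linear policy: one-hot state times a 16x4 weight matrix."""
    row = genome[state * N_ACTIONS:(state + 1) * N_ACTIONS]
    return max(range(N_ACTIONS), key=lambda a: row[a])


def fitness(genome, max_steps=50):
    """Episode return: 1.0 if the agent reaches the goal, otherwise 0.0."""
    r, c = 0, 0
    for _ in range(max_steps):
        dr, dc = MOVES[act(genome, r * SIZE + c)]
        # Moving into a wall leaves the agent in place.
        r = min(max(r + dr, 0), SIZE - 1)
        c = min(max(c + dc, 0), SIZE - 1)
        if LAKE[r][c] == "G":
            return 1.0
        if LAKE[r][c] == "H":
            return 0.0
    return 0.0


def evolve(pop_size=50, generations=30, n_elite=5, sigma=0.5, seed=0):
    """Evolve weight vectors; returns the best genome and per-generation best fitness."""
    rng = random.Random(seed)
    n_weights = N_STATES * N_ACTIONS
    pop = [[rng.gauss(0, 1) for _ in range(n_weights)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[:n_elite]  # elites survive unchanged
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_weights)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied to every weight.
            children.append([w + rng.gauss(0, sigma) for w in child])
        pop = elite + children
    return max(pop, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    print("best fitness per generation:", history)
```

Because the elites are copied unchanged and this toy environment is deterministic, the best fitness per generation never decreases. On the stochastic (slippery) FrozenLake variant, fitness would instead need to be averaged over several episodes, which is where data efficiency becomes a concern.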