Playing a 2D Game Indefinitely using NEAT and Reinforcement Learning

For over a decade now, robotics and the use of artificial agents have become commonplace. Testing the performance of new path-finding or search-space optimization algorithms has also become a challenge, because they require a simulation or an environment in which to run. Creating artificial environments populated with artificial agents is one method of testing such algorithms, and games have become one such environment. The algorithms can be compared by deploying artificial agents that behave according to each algorithm in the environment they are placed in. One performance measure is how quickly the agent learns to distinguish rewarding actions from harmful ones. This can be tested by placing the agent in an environment with different types of hurdles, where the agent's goal is to travel as far as possible by choosing actions that avoid every obstacle. The environment chosen is the game "Flappy Bird". The goal of the game is to fly the bird through a series of pipes of random heights. The bird must pass between these pipes without hitting the top, the bottom, or the pipes themselves. At each step the bird can either flap its wings or drop under gravity. The algorithms applied to the artificial agents are NeuroEvolution of Augmenting Topologies (NEAT) and reinforcement learning. NEAT starts from an initial population of N artificial agents and evolves them as a genetic algorithm does, using an objective function, crossover, mutation, and augmenting topologies. Reinforcement learning, on the other hand, uses a single agent and a Deep Q-learning Network, and remembers each state, the action taken in that state, and the reward received for that action. The performance of the NEAT algorithm improves as the initial population of artificial agents is increased.
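As an illustration of the reinforcement learning side, the idea of remembering the state, the action taken, and the reward received can be sketched as a small replay memory paired with a Q-learning update. This is a minimal sketch under stated assumptions, not the implementation used in this work: a plain dictionary stands in for the Deep Q-learning Network, and the names `ReplayMemory` and `q_update`, the state labels, the reward values, and the `alpha` and `gamma` settings are all hypothetical choices made for the example.

```python
import random
from collections import deque, namedtuple

# One remembered step: the state, the action taken there, the reward received,
# the state that followed, and whether the episode ended (the bird crashed).
Transition = namedtuple("Transition", "state action reward next_state done")

DROP, FLAP = 0, 1  # the two actions available to the bird


class ReplayMemory:
    """Fixed-capacity store of past transitions; the oldest are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Draw a random batch without replacement (at most the whole buffer).
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

    def __len__(self):
        return len(self.buffer)


def q_update(q, t, alpha=0.5, gamma=0.9):
    """Apply one Q-learning update for transition t to the table q.

    Assumption: q is a dict keyed by (state, action); a deep Q-network would
    replace this table with a neural network trained toward the same target.
    """
    if t.done:
        best_next = 0.0  # no future reward after a crash
    else:
        best_next = max(q.get((t.next_state, a), 0.0) for a in (DROP, FLAP))
    target = t.reward + gamma * best_next
    old = q.get((t.state, t.action), 0.0)
    q[(t.state, t.action)] = old + alpha * (target - old)
    return q[(t.state, t.action)]


# Usage: record two steps (illustrative rewards), then learn from a sampled batch.
memory = ReplayMemory(capacity=1000)
memory.push("gap_ahead", FLAP, 1.0, "pipe_close", False)
memory.push("pipe_close", DROP, -10.0, "crashed", True)
q_table = {}
for transition in memory.sample(batch_size=2):
    q_update(q_table, transition)
```

Sampling from stored transitions, rather than learning only from the most recent step, lets the single agent reuse past experience many times.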
