This paper presents a technique for training a robot to perform a kick motion in AI Soccer using reinforcement learning (RL). In RL, an agent interacts with an environment and learns to choose an action in each state at every step. When training RL algorithms, a problem called the curse of dimensionality (COD) can occur if the dimension of the state is high and the amount of training data is small. The COD often degrades the performance of RL models. When the robot kicks the ball, it chooses its action, as the ball approaches, based on information obtained from the soccer field. To avoid the COD, the training data, which in RL are the agent's experiences, would have to be collected evenly from all areas of the soccer field over (theoretically infinite) time. In this paper, we instead use a relative coordinate system (RCS) as the state for training the kick motion of the robot agent, rather than an absolute coordinate system (ACS). Using the RCS removes the need for the agent to know the state of the entire soccer field, reduces the dimension of the state the agent requires to perform the kick motion, and consequently alleviates the COD. Training with the RCS is performed using the widely used Deep Q-Network (DQN) and tested in an AI Soccer environment implemented with the Webots simulation software.
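The ACS-to-RCS idea can be illustrated with a minimal sketch: an absolute field position is translated by the robot's position and rotated by its heading, so the resulting state is robot-centric and independent of where on the field the interaction happens. The function name and frame convention below are illustrative assumptions, not taken from the paper:

```python
import math

def to_relative(robot_x, robot_y, robot_theta, target_x, target_y):
    """Express an absolute field position in the robot's local frame.

    The target is translated by the robot's position and rotated by the
    negative of the robot's heading, so the same robot-ball geometry
    always maps to the same state, wherever it occurs on the field.
    """
    dx = target_x - robot_x
    dy = target_y - robot_y
    cos_t = math.cos(-robot_theta)
    sin_t = math.sin(-robot_theta)
    rel_x = dx * cos_t - dy * sin_t  # forward distance in the robot frame
    rel_y = dx * sin_t + dy * cos_t  # lateral distance in the robot frame
    return rel_x, rel_y

# A ball one meter directly ahead maps to roughly (1, 0) regardless
# of the robot's absolute pose on the field.
print(to_relative(0.0, 0.0, 0.0, 1.0, 0.0))
print(to_relative(3.0, -2.0, math.pi / 2, 3.0, -1.0))
```

Because distinct absolute poses collapse onto the same relative state, experiences collected anywhere on the field contribute to the same region of the (lower-dimensional) state space, which is how the RCS mitigates the COD.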