A Dynamics Perspective of Pursuit-Evasion Games of Intelligent Agents with the Ability to Learn

Pursuit-evasion games are ubiquitous in nature and in artificial worlds. In nature, pursuers and evaders are intelligent agents that can learn from experience, and dynamics (i.e., Newtonian or Lagrangian) is vital to both the pursuer and the evader in some scenarios. To this end, this paper addresses the pursuit-evasion game of intelligent agents from the perspective of dynamics. First, a bio-inspired dynamics formulation of a pursuit-evasion game and baseline pursuit and evasion strategies are introduced. Then, reinforcement learning techniques are used to mimic the ability of intelligent agents to learn from experience. Based on the dynamics formulation and reinforcement learning techniques, the effects of improving both pursuit and evasion strategies through experience are investigated at two levels: 1) individual runs and 2) ranges of the parameters of pursuit-evasion games. The results are consistent with observations in nature and with the natural law of survival of the fittest. More importantly, this study reaches a result that differs from that of a pursuit-evasion game between agents with baseline strategies: in a pursuit-evasion game with a dynamics formulation, an evader relying on agile maneuvers and an effective learned evasion strategy is not able to escape from a slightly faster pursuer with an effective learned pursuit strategy.
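To make the phrase "dynamics formulation" concrete, the following is a minimal sketch and not the paper's model. It assumes planar point-mass agents with a bounded acceleration and a speed cap, a pure-pursuit rule for the pursuer, and a flee-directly-away rule for the evader. All function names, parameter values, and initial conditions are hypothetical choices made for illustration, and no learning is involved.

```python
import math

def steer(vel, desired_dir, max_speed, max_acc, dt):
    """Bounded acceleration that drives vel toward max_speed along desired_dir."""
    norm = math.hypot(desired_dir[0], desired_dir[1])
    if norm == 0.0:
        return (0.0, 0.0)
    # Velocity error between the desired velocity and the current velocity.
    dvx = desired_dir[0] / norm * max_speed - vel[0]
    dvy = desired_dir[1] / norm * max_speed - vel[1]
    mag = math.hypot(dvx, dvy)
    if mag == 0.0:
        return (0.0, 0.0)
    # Cap the acceleration norm at max_acc and avoid overshooting within one step.
    scale = min(max_acc, mag / dt) / mag
    return (dvx * scale, dvy * scale)

def step(pos, vel, acc, max_speed, dt):
    """One semi-implicit Euler step: update velocity, cap speed, then move."""
    vx, vy = vel[0] + acc[0] * dt, vel[1] + acc[1] * dt
    speed = math.hypot(vx, vy)
    if speed > max_speed:
        vx, vy = vx * max_speed / speed, vy * max_speed / speed
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)

def simulate(pursuer_speed, evader_speed, max_acc=1.0,
             capture_radius=0.5, dt=0.05, max_steps=5000):
    """Return (captured, elapsed_time) for a chase starting 10 units apart at rest."""
    p_pos, p_vel = (0.0, 0.0), (0.0, 0.0)
    e_pos, e_vel = (10.0, 0.0), (0.0, 0.0)
    for k in range(1, max_steps + 1):
        # Line of sight from pursuer to evader: the pursuer heads along it
        # (pure pursuit) and the evader flees along the same direction.
        los = (e_pos[0] - p_pos[0], e_pos[1] - p_pos[1])
        p_acc = steer(p_vel, los, pursuer_speed, max_acc, dt)
        e_acc = steer(e_vel, los, evader_speed, max_acc, dt)
        p_pos, p_vel = step(p_pos, p_vel, p_acc, pursuer_speed, dt)
        e_pos, e_vel = step(e_pos, e_vel, e_acc, evader_speed, dt)
        if math.hypot(e_pos[0] - p_pos[0], e_pos[1] - p_pos[1]) <= capture_radius:
            return True, k * dt
    return False, max_steps * dt

print(simulate(1.1, 1.0))  # slightly faster pursuer
print(simulate(0.9, 1.0))  # slower pursuer
```

Because the evader here only runs in a straight line, the outcome is decided by the speed caps alone. Turning maneuvers and learned strategies, which are the subject of the abstract, would replace the two hand-written steering rules.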
