Only limited studies and superficial evaluations are available on agents' behaviors and roles within a Multi-Agent System (MAS). We simulate a MAS using Reinforcement Learning (RL) in a pursuit-evasion (a.k.a. predator-prey pursuit) game, which shares task goals with target acquisition, and we create different adversarial scenarios by replacing the RL-trained pursuers' policies with two distinct (non-RL) analytical strategies. Using heatmaps of agents' positions (a state-space variable) over time, we are able to categorize an RL-trained evader's behaviors. The novelty of our approach lies in the creation of an influential feature set that reveals underlying regularities in the data, which allow us to classify an agent's behavior. This classification may aid in catching (enemy) targets by enabling us to identify and predict their behaviors, and, when extended to pursuers, this approach to identifying teammates' behavior may allow agents to coordinate more effectively.
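As a rough illustration of the heatmap-based feature extraction described above, the sketch below (our own illustrative code, not drawn from the paper) bins an agent's logged (x, y) positions over an episode into a normalized occupancy grid whose flattened cells could serve as inputs to a behavior classifier; the grid resolution, arena bounds, and the random stand-in trajectory are assumptions for demonstration only.

```python
import numpy as np

def position_heatmap(trajectory, grid_size=20, bounds=(0.0, 1.0)):
    """Bin an agent's (x, y) positions over an episode into a 2-D occupancy grid.

    trajectory: array of shape (T, 2) with the agent's position at each time step.
    grid_size:  number of bins per axis (assumed resolution).
    bounds:     (min, max) extent of the square arena (assumed normalized).
    """
    lo, hi = bounds
    heatmap, _, _ = np.histogram2d(
        trajectory[:, 0], trajectory[:, 1],
        bins=grid_size, range=[[lo, hi], [lo, hi]],
    )
    # Normalize so heatmaps from episodes of different lengths are comparable.
    return heatmap / heatmap.sum()

# Example: flatten the normalized grid into a feature vector for a classifier.
rng = np.random.default_rng(0)
episode = rng.random((500, 2))   # stand-in for a logged evader trajectory
features = position_heatmap(episode).ravel()
```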