This tutorial is intended to help researchers use the FireCommander game environment in their work. FireCommander is an interactive, probabilistic, joint perception-action reconnaissance environment in which a composite team of agents (e.g., robots) cooperates to fight dynamic, propagating firespots (e.g., targets). In the FireCommander game, a team of agents must be tasked to deal optimally with a wildfire in an environment that contains propagating fire areas and facilities such as houses, hospitals, and power stations. The team accomplishes its mission by first sensing (e.g., estimating fire states), then communicating the sensed fire information among its members, and finally acting on that information to put the firespots out (e.g., dropping water on estimated fire locations). The environment can be useful for research topics ranging from Reinforcement Learning (RL) and Learning from Demonstration (LfD) to Coordination, Psychology, Human-Robot Interaction (HRI), and Teaming. Four facets of the FireCommander environment together create a non-trivial game: (1) Complex Objectives: a multi-objective stochastic environment; (2) Probabilistic Environment: agents' actions result in probabilistic performance; (3) Hidden Targets: a partially observable environment; and (4) Uni-task Robots: perception-only and action-only agents. FireCommander is the first environment of its kind to include perception-only and action-only agents for coordination. It is a general, multi-purpose game that can be useful for a variety of combinatorial optimization problems and stochastic games, including applications of RL, LfD, and Inverse RL (iRL).
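The sense, communicate, act cycle described above can be illustrated with a minimal toy sketch. This is not the FireCommander API: the class `ToyFireWorld`, the functions `sense`, `act`, and `step`, and every parameter (`spread_prob`, `detect_prob`, `success_prob`, `radius`) are hypothetical names chosen for illustration, and the grid dynamics are a simplifying assumption rather than the environment's actual fire model.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyFireWorld:
    """Hypothetical grid world (an assumption, not the FireCommander API)."""
    size: int
    fires: set                      # true fire cells, hidden from the agents
    spread_prob: float = 0.1        # chance a fire ignites each 4-neighbour
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def propagate(self):
        """Stochastically spread every fire to its in-bounds 4-neighbours."""
        new_fires = set()
        for x, y in sorted(self.fires):
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                in_bounds = 0 <= nx < self.size and 0 <= ny < self.size
                if in_bounds and self.rng.random() < self.spread_prob:
                    new_fires.add((nx, ny))
        self.fires |= new_fires


def sense(world, position, radius, detect_prob):
    """Perception-only agent: noisily detect fires within `radius` cells."""
    px, py = position
    return {
        (x, y)
        for x, y in sorted(world.fires)
        if max(abs(x - px), abs(y - py)) <= radius
        and world.rng.random() < detect_prob
    }


def act(world, target, success_prob):
    """Action-only agent: drop water on `target`; success is probabilistic."""
    if target in world.fires and world.rng.random() < success_prob:
        world.fires.discard(target)
        return True
    return False


def step(world, perception_positions, radius=2, detect_prob=0.9,
         success_prob=0.8):
    """One round: sense, share detections, act on them, then let fire spread."""
    shared = set()
    for pos in perception_positions:
        # Communication is modelled as merging detections into a shared set.
        shared |= sense(world, pos, radius, detect_prob)
    extinguished = sum(act(world, cell, success_prob) for cell in sorted(shared))
    world.propagate()
    return shared, extinguished
```

In this sketch the action-only agents can target only cells that a perception-only agent has reported, which mirrors the partial observability and the uni-task split in the description; fires outside every sensing radius stay hidden and keep spreading.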