In this paper, we consider the problem of allocating human operators in a system with multiple semi-autonomous robots. Each robot must perform an independent sequence of tasks and, at every task, has some chance of failing and becoming stuck in a fault state. When required, a human operator can assist or teleoperate a robot. Conventional MDP techniques for solving such problems face scalability issues because the state and action spaces grow exponentially with the number of robots and operators. We derive conditions under which the operator allocation problem is indexable, enabling the use of the Whittle index heuristic. These conditions are easy to check, and we show that they hold for a wide range of problems of interest. Our key insight is to leverage the structure of the value function of individual robots, resulting in conditions that can be verified separately for each state of each robot. We apply these conditions to two types of transitions commonly seen in remote robot supervision systems. Through numerical simulations, we demonstrate the efficacy of the Whittle index policy as a near-optimal and scalable approach that outperforms existing scalable methods.
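To illustrate why an index heuristic scales, the following is a minimal sketch of the allocation step of a Whittle index policy, assuming the per-state indices have already been computed for each robot. The function name `allocate_operators`, the state labels, and the index values are hypothetical and are not taken from the paper; the sketch also assumes that every available operator is assigned at each step.

```python
from typing import Dict, List, Sequence


def allocate_operators(
    robot_states: Sequence[str],
    whittle_index: Sequence[Dict[str, float]],
    num_operators: int,
) -> List[int]:
    """Return the ids of the robots that receive an operator at this step.

    robot_states[i]  -- current state label of robot i (hypothetical labels)
    whittle_index[i] -- precomputed index of each state of robot i (assumed given)
    num_operators    -- number of operators available
    """
    # Look up the index of each robot's current state. Each lookup uses only
    # that robot's own table, so the cost grows linearly with the number of
    # robots rather than with the size of the joint state space.
    scored = [(whittle_index[i][s], i) for i, s in enumerate(robot_states)]

    # Highest index first; ties are broken by the lower robot id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))

    # Assign operators to the robots with the largest indices.
    return [robot_id for _, robot_id in scored[:num_operators]]


# Hypothetical example: three robots, two operators.
states = ["normal", "fault", "fault"]
indices = [
    {"normal": 0.1, "fault": 2.0},
    {"normal": 0.2, "fault": 1.5},
    {"normal": 0.05, "fault": 3.0},
]
assigned = allocate_operators(states, indices, num_operators=2)
```

In this example, robots 2 and 1 are selected because their current states carry the two largest indices. The sketch covers only the allocation rule; computing the indices themselves requires the indexability conditions discussed in the paper.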