利用时间窗口为合作任务提供服务分布式规划:游戏理论方法 (Distributed Planning for Serving Cooperative Tasks with Time Windows: A Game Theoretic Approach)

We study distributed planning for multi-robot systems to provide optimal service to cooperative tasks that are distributed over space and time. Each task requires service by sufficiently many robots at the specified location within the specified time window. Tasks arrive over episodes and the robots try to maximize the total value of service in each episode by planning their own trajectories based on the specifications of incoming tasks. Robots are required to start and end each episode at their assigned stations in the environment. We present a game theoretic solution to this problem by mapping it to a game, where the action of each robot is its trajectory in an episode, and using a suitable learning algorithm to obtain optimal joint plans in a distributed manner. We present a systematic way to design minimal action sets (subsets of feasible trajectories) for robots based on the specifications of incoming tasks to facilitate fast learning. We then provide the performance guarantees for the cases where all the robots follow a best response or noisy best response algorithm to iteratively plan their trajectories. While the best response algorithm leads to a Nash equilibrium, the noisy best response algorithm leads to globally optimal joint plans with high probability. We show that the proposed game can in general have arbitrarily poor Nash equilibria, which makes the noisy best response algorithm preferable unless the task specifications are known to have some special structure. We also describe a family of special cases where all the equilibria are guaranteed to have bounded suboptimality. Simulations and experimental results are provided to demonstrate the proposed approach.

翻译：我们研究多机器人系统的分布规划,以便为空间和时间分布的合作任务提供最佳服务。每项任务都需要在指定的时间窗口内指定地点由足够多的机器人在指定的时间窗口内提供服务。任务会经过场面, 机器人试图根据即将到来的任务的规格, 最大限度地增加每集服务的总价值。机器人需要开始和结束在指定环境的站点的每集事件。我们通过将每个机器人的动作映射到一个游戏, 将它映射到一个游戏, 每个机器人的动作是它的一个轨迹, 并且使用一个适当的学习算法, 以分布的方式获得最佳的联合计划。我们提出一个系统的方法, 设计一个最小的动作组( 可行轨迹的子), 以便根据即将到来的任务的规格, 最大限度地增加服务。然后, 我们为所有机器人对迭接性计划的最佳反应算法提供性保证。最佳的算法可以导致纳什平衡, 最冷的最佳反应算法导致全球最佳的联合计划, 除非有很高的机率。我们展示了某些特殊的特别联合计划, 高的机率性标准。我们提出, 也展示了最有预的算。我们提议的机性地解释了。我们提出一个最优的算。

相关内容

CASES

关注 4

CASES：International Conference on Compilers, Architectures, and Synthesis for Embedded Systems。 Explanation：嵌入式系统编译器、体系结构和综合国际会议。 Publisher：ACM。 SIT： http://dblp.uni-trier.de/db/conf/cases/index.html

不可错过！UIUC最新《统计强化学习》课程！

专知会员服务

54+阅读 · 2020年9月7日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日