We study the problem of minimizing the resource capacity of autonomous agents cooperating to achieve a shared task. More specifically, we consider high-level planning for a team of homogeneous agents that operate under resource constraints in stochastic environments and share a common goal: given a set of target locations, ensure that each location is visited infinitely often by some agent almost surely. We formalize the dynamics of the agents by consumption Markov decision processes. In a consumption Markov decision process, the agent has a resource of limited capacity, and each action of the agent may consume some amount of the resource. To avoid exhaustion, the agent can replenish its resource to full capacity in designated reload states. The resource capacity therefore restricts the capabilities of the agent. The objective is to assign target locations to agents so that each agent is responsible only for repeatedly visiting its assigned subset of target locations. Moreover, the assignment must ensure that the agents can carry out their tasks with minimal resource capacity. We reduce the problem of finding target assignments for a team of agents with the lowest possible capacity to an equivalent graph-theoretical problem. We develop an algorithm that solves this graph problem in time that is \emph{polynomial} in the number of agents, the number of target locations, and the size of the consumption Markov decision process. We demonstrate the applicability and scalability of the algorithm in a scenario where hundreds of unmanned underwater vehicles monitor hundreds of locations in environments with stochastic ocean currents.
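The resource mechanics described above (consumption per action, reloading to full capacity in designated states) can be illustrated with a minimal sketch. It is not the algorithm of this work: it assumes a single agent following a fixed, deterministic path and ignores stochastic transitions entirely. The names `run_path`, `min_capacity_for_path`, and the state label `"dock"` are hypothetical, and the sketch assumes that the agent starts with a full resource and that exhaustion means the level would drop below zero.

```python
from typing import Iterable, List, Optional, Set, Tuple

# One step of a path: (resource consumed by the action, state reached).
Step = Tuple[int, str]


def run_path(capacity: int, steps: Iterable[Step],
             reload_states: Set[str]) -> Optional[int]:
    """Return the final resource level, or None if the resource is exhausted."""
    level = capacity  # assumption: the agent starts with a full resource
    for cost, state in steps:
        level -= cost
        if level < 0:
            return None  # the action needed more resource than was left
        if state in reload_states:
            level = capacity  # a reload state restores full capacity
    return level


def min_capacity_for_path(steps: Iterable[Step],
                          reload_states: Set[str]) -> int:
    """Smallest capacity that lets the agent follow this fixed path."""
    worst = 0  # largest consumption seen between two consecutive reloads
    spent = 0  # consumption since the last reload (or since the start)
    for cost, state in steps:
        spent += cost
        worst = max(worst, spent)
        if state in reload_states:
            spent = 0
    return worst


# Hypothetical path: two target visits, each followed by a return to a dock.
path: List[Step] = [(2, "a"), (3, "dock"), (4, "b"), (1, "dock")]
needed = min_capacity_for_path(path, {"dock"})
```

For a fixed path, the capacity requirement is the largest total consumption between two consecutive reloads. In the stochastic setting of the abstract the agent cannot fix its path in advance, so this sketch conveys only the intuition behind the capacity constraint.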