We consider a task-effective quantization problem that arises when multiple agents are controlled by a centralized controller (CC). The agents must communicate their observations to the CC for decision-making, but the bit budget of each agent-CC link may limit the task-effectiveness of the system, which is measured by the system's average sum of stage costs/rewards. Each agent should therefore compress/quantize its observation so that the average sum of stage costs/rewards of the control task is minimally impacted. We address the problem of maximizing the average sum of stage rewards by proposing two Action-Based State Aggregation (ABSA) algorithms that carry out the indirect and joint design of control and communication policies in the multi-agent system. Although ABSA-1 applies only to single-agent systems, it provides an analytical framework that serves as a stepping stone to the design of ABSA-2, which carries out the joint design of control and communication for a multi-agent system. We evaluate the algorithms, with average return as the performance metric, using numerical experiments on a multi-agent geometric consensus problem. We conclude the numerical results by introducing a new metric that measures the effectiveness of communications in a multi-agent system.
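The intuition behind aggregating states by action can be illustrated with a small single-agent sketch. It is not the ABSA-1 or ABSA-2 procedure itself, whose details the abstract does not give. It assumes a hypothetical tabular value function `q_table` (rows are observations, columns are actions) is already available, and it groups all observations that share the same greedy action under one codeword. The function name, the toy values, and the use of a Q-table are all assumptions made for illustration.

```python
import math
import numpy as np

def action_based_quantizer(q_table):
    """Assign one codeword to all observations that share a greedy action.

    q_table is a hypothetical (num_observations, num_actions) array of
    action values; it stands in for whatever value function is available.
    """
    # Greedy action for every observation.
    greedy = np.argmax(q_table, axis=1)
    # Distinct greedy actions act as the decoder table; the inverse
    # indices are the codewords the agent would transmit.
    decoder, codewords = np.unique(greedy, return_inverse=True)
    # Bits needed to index the codewords (0 if only one codeword exists).
    bits = math.ceil(math.log2(len(decoder))) if len(decoder) > 1 else 0
    return codewords, decoder, bits

# Toy example: four observations, two actions.
q_table = np.array([
    [1.0, 0.2],
    [0.9, 0.1],
    [0.0, 0.7],
    [0.3, 0.8],
])
codewords, decoder, bits = action_based_quantizer(q_table)
# The controller recovers the greedy action from the codeword alone.
recovered_actions = decoder[codewords]
```

In this toy case the four observations collapse into two codewords, so one bit per transmission is enough for the controller to reproduce the greedy action. When the bit budget is smaller than the number of distinct greedy actions, some loss in return is unavoidable, which is the regime the quantization problem above is concerned with.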