Communication is expected to improve collaboration and overall performance in cooperative multi-agent reinforcement learning (MARL). In practice, however, these improvements are often limited because most existing communication schemes ignore communication overheads such as communication delays. In this paper, we demonstrate that ignoring communication delays has detrimental effects on collaboration, especially in delay-sensitive tasks such as autonomous driving. To mitigate this impact, we design a delay-aware multi-agent communication model (DACOM) that adapts communication to delays. Specifically, DACOM introduces a component, TimeNet, that adjusts how long an agent waits to receive messages from other agents, so that the uncertainty associated with delay can be addressed. Our experiments show that DACOM achieves a non-negligible performance improvement over other mechanisms by making a better trade-off between the benefits of communication and the costs of waiting for messages.
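The trade-off between the benefit of received messages and the cost of waiting can be illustrated with a toy calculation. The sketch below is not DACOM's TimeNet (which the abstract describes as a component of the learned model); it is a hypothetical, hand-written scoring rule. The function names, the linear utility, and the parameters `benefit_per_msg` and `cost_per_unit_wait` are all assumptions made for illustration only.

```python
def utility(wait, delays, benefit_per_msg=1.0, cost_per_unit_wait=0.5):
    """Toy utility (assumed form): reward for each message that arrives
    within `wait`, minus a linear penalty for the time spent waiting."""
    received = sum(1 for d in delays if d <= wait)
    return benefit_per_msg * received - cost_per_unit_wait * wait


def best_wait(delays, candidates, **kwargs):
    """Pick the candidate waiting time with the highest toy utility."""
    return max(candidates, key=lambda w: utility(w, delays, **kwargs))


if __name__ == "__main__":
    # Hypothetical message delays from four other agents; one is very slow.
    delays = [0.1, 0.2, 0.3, 5.0]
    candidates = [0.0, 0.3, 5.0]

    # With an expensive wait, skipping the slow message scores higher.
    print(best_wait(delays, candidates, cost_per_unit_wait=0.5))
    # With a cheap wait, waiting for every message scores higher.
    print(best_wait(delays, candidates, cost_per_unit_wait=0.1))
```

In this toy setting the preferred waiting time shifts as the cost of waiting changes, which is the kind of adjustment the abstract attributes to TimeNet; how DACOM actually learns that adjustment is not specified here.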