Robotics research has increasingly focused on cooperative multi-agent problems, in which agents must work together and communicate to achieve a shared objective. To tackle this challenge, we explore imitation learning algorithms. These methods learn a controller by observing demonstrations of an expert, such as the behaviour of a centralised omniscient controller that can perceive the entire environment, including the state and observations of all agents. Performing tasks with complete knowledge of the state of a system is relatively easy, but centralised solutions might not be feasible in real scenarios, since agents have direct access only to their own observations and not to the full state. To overcome this issue, we train end-to-end neural networks on demonstrations from an omniscient centralised controller. The networks take as input local observations, i.e., the agents' sensor readings and the communications received, and produce as output the action to be performed and the communication to be transmitted. This study concentrates on two cooperative tasks solved with a distributed controller: distributing the robots evenly in space and colouring them based on their position relative to the others. An explicit exchange of messages between the agents is required to solve the second task; in the first, a communication protocol is unnecessary, although it may increase performance. The experiments are run in Enki, a high-performance open-source simulator for planar robots, which provides collision detection and limited physics support for robots evolving on a flat surface and can simulate groups of robots hundreds of times faster than real time. The results show that applying a communication strategy improves the performance of the distributed model, letting it decide which actions to take almost as precisely and quickly as the expert controller.
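The input/output interface of such a distributed controller can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the robots are reduced to points on a line, the two outermost robots are kept fixed, messages are single scalars, and `placeholder_policy` is a hand-written stand-in for the trained network. It is not Enki code and does not reproduce the authors' model.

```python
from typing import Callable, List, Tuple

# Hypothetical types; none of these names come from the study.
Observation = Tuple[float, float]  # (gap to robot behind, gap to robot ahead)
Message = float                    # scalar sent to both neighbours
Policy = Callable[[Observation, Message, Message], Tuple[float, Message]]


def placeholder_policy(obs: Observation, msg_back: Message,
                       msg_front: Message) -> Tuple[float, Message]:
    """Stand-in for a learned network: local inputs -> (action, message)."""
    back, front = obs
    # Action: proportional move toward the midpoint of the two neighbours.
    velocity = 0.5 * (front - back)
    # Outgoing message: mean of the received ones (a trained network
    # would learn what to transmit; this rule is arbitrary).
    message = 0.5 * (msg_back + msg_front)
    return velocity, message


def step(positions: List[float], messages: List[Message], policy: Policy,
         dt: float = 0.1) -> Tuple[List[float], List[Message]]:
    """Advance all robots one synchronous step using only local information."""
    new_positions = list(positions)
    new_messages = list(messages)
    # Assumption: the first and last robots stay fixed and never update.
    for i in range(1, len(positions) - 1):
        obs = (positions[i] - positions[i - 1], positions[i + 1] - positions[i])
        velocity, message = policy(obs, messages[i - 1], messages[i + 1])
        new_positions[i] = positions[i] + dt * velocity
        new_messages[i] = message
    return new_positions, new_messages


def run(positions: List[float], policy: Policy, steps: int = 1000) -> List[float]:
    """Iterate the controller from an initial configuration."""
    messages = [0.0] * len(positions)
    for _ in range(steps):
        positions, messages = step(positions, messages, policy)
    return positions
```

Each robot calls the same policy on its own sensor readings and the messages received from its neighbours, so no agent needs access to the global state. In this toy version the action ignores the messages, which mirrors the observation that communication is not strictly required for the even-distribution task.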