Non-Parametric Neuro-Adaptive Coordination of Multi-Agent Systems

We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed algorithm avoids these drawbacks by integrating neural network-based learning with adaptive control in a two-step procedure. In the first step of the algorithm, each agent learns a controller, represented as a neural network, using training data that correspond to a collection of formation tasks and agent parameters. These parameters and tasks are derived by varying the nominal agent parameters and the formation specifications of the task in hand, respectively. In the second step of the algorithm, each agent incorporates the trained neural network into an online and adaptive control policy in such a way that the behavior of the multi-agent closed-loop system satisfies a user-defined formation task. Both the learning phase and the adaptive control policy are distributed, in the sense that each agent computes its own actions using only local information from its neighboring agents. The proposed algorithm does not use any a priori information on the agents' unknown dynamic terms or any approximation schemes. We provide formal theoretical guarantees on the achievement of the formation task.
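To make the distributed structure concrete, the following is a minimal Python sketch of a control loop in which each agent combines a stand-in for a learned controller with an adaptive gain, using only its neighbors' states. It is not the algorithm of the paper. The scalar single-integrator dynamics, the three-agent line graph, the `unknown_drift` term, the linear `learned_controller` placeholder (where a trained neural network would go), the adaptation law, and all gains are assumptions chosen for illustration.

```python
import math

# Hypothetical setup: 3 agents on a line graph with scalar positions.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
OFFSETS = [0.0, 1.0, 2.0]  # desired formation: x_j - x_i = OFFSETS[j] - OFFSETS[i]


def unknown_drift(i, x):
    """Dynamics term the controller never sees (used only by the simulator)."""
    return 0.3 * math.sin(x) + 0.1 * i


def learned_controller(e):
    """Stand-in for a trained network; here just a fixed linear map (assumption)."""
    return -1.0 * e


def local_error(i, x):
    """Formation error of agent i, computed from its neighbors' states only."""
    return sum((x[i] - x[j]) - (OFFSETS[i] - OFFSETS[j]) for j in NEIGHBORS[i])


def simulate(x0, steps=2000, dt=0.01, gamma=0.1):
    """Forward-Euler simulation of x_i' = f_i(x_i) + u_i for all agents."""
    x = list(x0)
    gain = [0.0] * len(x)  # adaptive gains, one per agent
    for _ in range(steps):
        e = [local_error(i, x) for i in range(len(x))]
        # Control input: learned term plus an adaptive feedback term.
        u = [learned_controller(e[i]) - gain[i] * e[i] for i in range(len(x))]
        # Each gain grows while the agent's own local error is nonzero.
        gain = [gain[i] + dt * gamma * e[i] ** 2 for i in range(len(x))]
        x = [x[i] + dt * (unknown_drift(i, x[i]) + u[i]) for i in range(len(x))]
    return x, gain


def total_error(x):
    """Sum of absolute local formation errors over all agents."""
    return sum(abs(local_error(i, x)) for i in range(len(x)))
```

Calling `simulate([0.0, 3.0, -1.0])` and comparing `total_error` before and after the run shows the formation error shrinking even though no agent has access to `unknown_drift`. The adaptive gain here is a generic textbook-style choice and carries none of the formal guarantees claimed in the abstract.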
