Non-Parametric Behavior Learning for Multi-Agent Decision Making With Chance Constraints: A Data-Driven Predictive Control Framework

Accurate and effective system identification is a common challenge in model predictive control (MPC) formulations. When an accurate model is unavailable, the performance of a traditional MPC algorithm can degrade significantly. To address this shortcoming, this paper investigates a non-parametric behavior learning method for multi-agent decision making, which underpins an alternative data-driven predictive control framework. Using closed-loop input/output measurements of the unknown system, the behavior of the system is learned from the collected dataset, and the resulting non-parametric predictive model is used to determine optimal control actions. A key advantage of this framework is that it alleviates the heavy computational burden of the optimization procedures typical of existing methods, which require open-loop input/output data collection and parametric system identification. Then, with a conservative approximation of the probabilistic chance constraints in the MPC problem, a deterministic optimization problem is formulated and solved effectively. The data-driven approach is also shown to preserve good robustness properties, even in the presence of the parametric uncertainties that naturally arise in a typical system identification process. Finally, a multi-drone system is used to demonstrate the practical effectiveness of the approach.
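To make the step from a probabilistic chance constraint to a deterministic one concrete, the following is a minimal sketch of one common conservative approximation. The abstract does not state which approximation the paper uses; this example assumes a scalar constraint Pr(z <= b) >= 1 - eps and tightens it with Cantelli's inequality, which needs only the mean and standard deviation of z. The function names `cantelli_margin`, `tightened_bound`, and `empirical_violation` are hypothetical and chosen for illustration only.

```python
import math
import random


def cantelli_margin(std, eps):
    """Back-off k * std with k = sqrt((1 - eps) / eps).

    By Cantelli's inequality, Pr(z >= mean + k * std) <= 1 / (1 + k^2) = eps
    for any distribution of z with the given standard deviation.
    (Assumed approximation; not necessarily the one used in the paper.)
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return math.sqrt((1.0 - eps) / eps) * std


def tightened_bound(b, std, eps):
    """Deterministic surrogate: requiring mean <= b - margin
    guarantees Pr(z <= b) >= 1 - eps."""
    return b - cantelli_margin(std, eps)


def empirical_violation(mean, std, b, samples=20000, seed=0):
    """Monte Carlo estimate of Pr(z > b) for Gaussian z (sanity check only)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples) if rng.gauss(mean, std) > b)
    return hits / samples


if __name__ == "__main__":
    b, std, eps = 10.0, 1.0, 0.2
    mean_limit = tightened_bound(b, std, eps)
    print("largest admissible mean:", mean_limit)
    print("empirical violation rate:", empirical_violation(mean_limit, std, b))
```

The bound is distribution-free and therefore conservative: for Gaussian disturbances the observed violation rate at the tightened bound is well below eps, which is the price paid for replacing the probabilistic constraint with a deterministic one.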
