Quadratic programs (QPs) that enforce control barrier functions (CBFs) have become popular for safety-critical control synthesis, in part because they are easy to implement and make constraints easy to specify. Constructing valid CBFs, however, is not straightforward, and for arbitrarily chosen QP parameters the system trajectories may enter states at which the QP eventually becomes infeasible, or the controller may fail to achieve the desired performance. In this work, we pose the control synthesis problem as a differential policy whose parameters are optimized at a high level for performance over a time horizon, resulting in a bi-level optimization routine. Because the set of feasible parameters is not known in advance, we develop a Recursive Feasibility Guided Gradient Descent approach for updating the parameters of the QP so that the new solution performs at least as well as the previous solution. By treating the dynamical system as a directed graph over time, this work presents a novel way of optimizing the performance of a QP controller with multiple CBFs over a time horizon by (1) computing the gradient of the QP solution with respect to its parameters using sensitivity analysis, and (2) backpropagating these gradients, together with the gradients of the system dynamics, to update the parameters while maintaining feasibility of the QPs.
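The two ingredients named above, the sensitivity of the QP solution to its parameters and the propagation of that sensitivity through the dynamics, can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the scalar single-integrator system, the single CBF with class-K gain `alpha`, the closed-form QP solution, all constants, and the function names `solve_qp`, `rollout`, and `adapt` are hypothetical choices made for illustration. The sensitivity is accumulated forward in time, which for one scalar parameter yields the same gradient as backpropagation. A simple backtracking rule stands in for the feasibility-guided update: a step is accepted only if every QP along the horizon stays feasible and the cost does not increase.

```python
# Toy setup (all values are illustrative assumptions, not from the paper):
# dynamics x_{t+1} = x_t + DT * u_t, safe set h(x) = X_MAX - x >= 0.
DT, STEPS = 0.05, 100              # integration step and horizon length
X0, X_GOAL, X_MAX = 0.0, 2.0, 2.5  # initial state, target, safety boundary
K, U_MAX = 2.0, 3.0                # nominal feedback gain and input bound


def solve_qp(x, alpha):
    """Closed-form solution of the scalar CBF-QP
        min_u (u - u_nom)^2
        s.t.  u <= alpha * (X_MAX - x)   (CBF condition)
              -U_MAX <= u <= U_MAX       (input bound)
    Returns (u, du/dx, du/dalpha, feasible); the derivatives follow from
    which constraint is active (sensitivity of the QP solution)."""
    u_nom = K * (X_GOAL - x)
    b = alpha * (X_MAX - x)
    if b < -U_MAX:                     # CBF bound contradicts input bound
        return 0.0, 0.0, 0.0, False
    if u_nom <= min(b, U_MAX):
        if u_nom < -U_MAX:             # lower input bound active
            return -U_MAX, 0.0, 0.0, True
        return u_nom, -K, 0.0, True    # no constraint active
    if b < U_MAX:                      # CBF constraint active
        return b, -alpha, X_MAX - x, True
    return U_MAX, 0.0, 0.0, True       # upper input bound active


def rollout(alpha):
    """Simulate the horizon; return (cost, dcost/dalpha, feasible)."""
    x, s = X0, 0.0                     # s tracks dx/dalpha
    cost, grad = 0.0, 0.0
    for _ in range(STEPS):
        u, du_dx, du_da, ok = solve_qp(x, alpha)
        if not ok:
            return float("inf"), 0.0, False
        # Chain rule through the dynamics and the QP solution.
        s = s + DT * (du_dx * s + du_da)
        x = x + DT * u
        cost += DT * (x - X_GOAL) ** 2
        grad += DT * 2.0 * (x - X_GOAL) * s
    return cost, grad, True


def adapt(alpha, iters=20, step=0.5):
    """Gradient descent on alpha with backtracking; a candidate is kept
    only if the rollout is feasible and the cost does not increase."""
    cost, grad, ok = rollout(alpha)
    assert ok, "initial parameter must give a feasible rollout"
    history = [cost]
    for _ in range(iters):
        trial = step
        while trial > 1e-8:
            cand = alpha - trial * grad
            if cand > 0.0:
                c_cost, c_grad, c_ok = rollout(cand)
                if c_ok and c_cost <= cost:
                    alpha, cost, grad = cand, c_cost, c_grad
                    break
            trial *= 0.5               # shrink the step and retry
        history.append(cost)
    return alpha, history


if __name__ == "__main__":
    alpha, history = adapt(0.5)
    print("final alpha:", alpha)
    print("initial cost:", history[0], "final cost:", history[-1])
```

In this toy problem the infeasibility branch acts only as a guard, since the single constraint and the input bound rarely conflict. With several CBFs and tighter input bounds the feasible parameter set is not known in advance, which is the situation the abstract addresses.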