Quadratic Program (QP)-based state-feedback controllers, whose inequality constraints bound the rate of change of control barrier functions (CBFs) and Lyapunov functions by a class-$\mathcal{K}$ function of their values, are sensitive to the parameters of these class-$\mathcal{K}$ functions. The construction of valid CBFs, however, is not straightforward, and for arbitrarily chosen QP parameters, the system trajectories may enter states at which the QP eventually becomes infeasible, or the controller may fail to achieve the desired performance. In this work, we pose the control synthesis problem as a differential policy whose parameters are optimized at a high level for performance over a time horizon, resulting in a bi-level optimization routine. In the absence of knowledge of the set of feasible parameters, we develop a Recursive Feasibility Guided Gradient Descent approach for updating the parameters of the QP so that the new solution performs at least as well as the previous solution. By treating the dynamical system as a directed graph over time, this work presents a novel way of optimizing the performance of a QP controller over a time horizon for multiple CBFs by (1) computing the gradient of the QP solution with respect to its parameters via sensitivity analysis, and (2) backpropagating these gradients, together with the gradients of the system dynamics, to update the parameters while maintaining feasibility of the QPs.
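The idea of differentiating a QP controller with respect to a class-$\mathcal{K}$ parameter and propagating that gradient through the dynamics can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a hypothetical one-dimensional single integrator $\dot{x} = u$, a single barrier $h(x) = 1 - x$, a linear class-$\mathcal{K}$ function $\alpha h$, and a closed-form QP solution, and it accumulates sensitivities in forward mode rather than by backpropagation. All names (`cbf_qp`, `rollout_cost_and_grad`, `u_ref`, `x_goal`) are illustrative assumptions.

```python
def cbf_qp(x, alpha, u_ref=1.0):
    """Closed-form solution of a one-constraint CBF-QP (illustrative assumption).

    min_u (u - u_ref)^2  s.t.  dh/dx * u >= -alpha * h(x),  with h(x) = 1 - x.
    For x' = u the constraint reads u <= alpha * (1 - x).
    Returns (u, du/dalpha, du/dx), i.e. the solution and its sensitivities.
    """
    bound = alpha * (1.0 - x)
    if u_ref <= bound:
        # Constraint inactive: the solution does not depend on alpha or x.
        return u_ref, 0.0, 0.0
    # Constraint active: u* = alpha * (1 - x).
    return bound, 1.0 - x, -alpha


def rollout_cost_and_grad(alpha, x0=0.0, x_goal=1.0, dt=0.1, steps=20):
    """Euler rollout returning J = sum_t dt * (x_t - x_goal)^2 and dJ/dalpha."""
    x, dx_dalpha = x0, 0.0
    cost, grad = 0.0, 0.0
    for _ in range(steps):
        u, du_dalpha, du_dx = cbf_qp(x, alpha)
        # Chain rule through x_{t+1} = x_t + dt * u(x_t, alpha).
        dx_dalpha = dx_dalpha + dt * (du_dalpha + du_dx * dx_dalpha)
        x = x + dt * u
        cost += dt * (x - x_goal) ** 2
        grad += dt * 2.0 * (x - x_goal) * dx_dalpha
    return cost, grad


# One plain gradient-descent step on alpha (step size chosen arbitrarily).
alpha = 0.5
cost, grad = rollout_cost_and_grad(alpha)
alpha_new = alpha - 0.1 * grad
```

In this toy setting the QP is always feasible because there is one constraint and no input bound, so the step needs no feasibility check. With multiple CBFs or input limits, a parameter update can make a later QP infeasible, which is the situation the feasibility-guided update described above is meant to handle.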