We consider safety in simultaneous learning and control of discrete-time linear time-invariant systems. We provide rigorous confidence bounds on the learned model of the system as a function of the number of state measurements used. These bounds are used to modify the control inputs to the system via an optimization problem with potentially time-varying safety constraints. We prove that the state can exit the safe set only with small probability, provided that the safety-constrained optimization problem has a feasible solution. We then reformulate this optimization problem in a more computationally friendly form by tightening the safety constraints to account for model uncertainty during learning. The tightening decreases as confidence in the learned model improves. Finally, we prove that, under persistence of excitation, the tightening becomes negligible as more measurements are gathered.
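To make the idea of constraint tightening concrete, the following is a minimal sketch and not the method of the paper. It assumes a scalar system x_{t+1} = a x_t + b u_t + w_t with hypothetical parameters, a least-squares model estimate, and a placeholder confidence radius of the form c / sqrt(lambda_min), where lambda_min is the smallest eigenvalue of the regressor Gram matrix. The paper's actual bounds are not stated in the abstract, so the constant `c`, the function names, and the grid-search safety filter are all illustrative assumptions.

```python
import numpy as np


def simulate(N, rng, a=0.9, b=0.5, sigma=0.1):
    """Roll out a hypothetical scalar system under random (exciting) inputs."""
    X = np.zeros((N + 1, 1))
    U = rng.uniform(-1.0, 1.0, size=(N, 1))
    for t in range(N):
        X[t + 1] = a * X[t] + b * U[t] + sigma * rng.standard_normal()
    return X, U


def estimate_model(X, U):
    """Least-squares estimate of [A B] from states X (N+1, n) and inputs U (N, m)."""
    Z = np.hstack([X[:-1], U])  # regressors z_t = [x_t, u_t]
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)  # solves Z @ Theta ~ X[1:]
    n = X.shape[1]
    A_hat, B_hat = Theta[:n].T, Theta[n:].T
    lam_min = np.linalg.eigvalsh(Z.T @ Z)[0]  # smallest eigenvalue: excitation level
    return A_hat, B_hat, lam_min


def tightening(lam_min, c=1.0):
    """ASSUMED placeholder confidence radius; not the bound derived in the paper."""
    return c / np.sqrt(lam_min)


def safe_input(x, u_nom, a_hat, b_hat, eps, x_max, u_grid):
    """Pick the grid input closest to u_nom that keeps the predicted next state
    inside |x| <= x_max after tightening by the model-error margin.

    If the error in [a b] has norm at most eps, the one-step prediction error is
    at most eps * ||(x, u)||, which is the margin used here. Disturbances are
    not accounted for in this sketch.
    """
    pred = a_hat * x + b_hat * u_grid
    margin = eps * np.hypot(x, u_grid)
    feasible = np.abs(pred) + margin <= x_max
    if not feasible.any():
        return None  # no feasible solution: the guarantee does not apply
    candidates = u_grid[feasible]
    return candidates[np.argmin(np.abs(candidates - u_nom))]


rng = np.random.default_rng(0)
eps_by_N = {}
for N in (50, 5000):
    X, U = simulate(N, rng)
    A_hat, B_hat, lam_min = estimate_model(X, U)
    eps_by_N[N] = tightening(lam_min)  # shrinks as excitation accumulates

# Filter a nominal input using the model learned from the longer data set.
a_hat, b_hat = float(A_hat[0, 0]), float(B_hat[0, 0])
u_grid = np.linspace(-1.0, 1.0, 201)
u_safe = safe_input(0.8, 1.0, a_hat, b_hat, eps_by_N[5000], 1.0, u_grid)
print(eps_by_N, u_safe)
```

Under these assumptions, the margin computed from 5000 samples is smaller than the one computed from 50, mirroring the claim that the tightening decreases as more measurements are gathered under persistent excitation. The filter returns `None` when no grid input satisfies the tightened constraint, which corresponds to the case where the safety-constrained problem is infeasible.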