Learning-based control schemes have recently shown great efficacy in performing complex tasks. However, in order to deploy them on real systems, it is of vital importance to guarantee that the system remains safe during online training and execution. We therefore need safe online learning frameworks able to autonomously reason about whether the information currently at their disposal is enough to ensure safety or whether new measurements are required. In this paper, we present a framework consisting of two parts: first, an out-of-distribution detection mechanism that actively collects measurements when needed to guarantee that at least one safety backup direction is always available for use; and second, a Gaussian Process-based probabilistic safety-critical controller that ensures the system stays safe at all times with high probability. Our method exploits model knowledge through the use of Control Barrier Functions, and collects measurements from the stream of online data in an event-triggered fashion to guarantee recursive feasibility of the learned safety-critical controller. This, in turn, allows us to provide formal guarantees of forward invariance of a safe set with high probability, even in a priori unexplored regions. Finally, we validate the proposed framework in numerical simulations of an adaptive cruise control system.
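To make the Control Barrier Function mechanism concrete, the following is a minimal sketch of a CBF safety filter for the adaptive cruise control setting mentioned in the abstract. It is an illustration only, not the paper's method: it assumes known dynamics (no Gaussian Process model, no out-of-distribution detection), a hypothetical time-headway barrier h = D − T_h·v (D: distance to the lead vehicle, v: ego velocity), and a nominal desired acceleration u_des. With a single affine constraint, the safety-filter QP has the closed-form solution min(u_des, u_max).

```python
def cbf_safety_filter(u_des, D, v, v_lead, T_h=1.8, alpha=1.0):
    """Minimally modify u_des so the barrier h = D - T_h*v stays nonnegative.

    With dynamics Ddot = v_lead - v and vdot = u, we have
    hdot = (v_lead - v) - T_h*u, so the CBF condition hdot >= -alpha*h
    is equivalent to u <= ((v_lead - v) + alpha*h) / T_h.
    (T_h, alpha are illustrative parameter choices, not from the paper.)
    """
    h = D - T_h * v
    u_max = ((v_lead - v) + alpha * h) / T_h
    return min(u_des, u_max), h

# Euler simulation: the ego vehicle wants to accelerate (u_des = 2.0)
# toward a slower lead vehicle; the filter caps u to keep h >= 0.
D, v, v_lead, dt = 50.0, 20.0, 15.0, 0.01
h_trace = []
for _ in range(1000):
    u, h = cbf_safety_filter(u_des=2.0, D=D, v=v, v_lead=v_lead)
    h_trace.append(h)
    D += (v_lead - v) * dt   # relative distance dynamics
    v += u * dt              # ego acceleration
print(min(h_trace) >= 0.0)   # forward invariance of {h >= 0}
```

In the paper's setting, the known-dynamics terms above would be replaced by Gaussian Process estimates with probabilistic error bounds, and the event-triggered measurement collection would keep the resulting chance-constrained filter recursively feasible.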