Failing with Grace: Learning Neural Network Controllers that are Boundedly Unsafe

In this work, we consider the problem of learning a feed-forward neural network (NN) controller to safely steer an arbitrarily shaped planar robot in a compact, obstacle-occluded workspace. Existing methods for training NN controllers with closed-loop safety guarantees depend strongly on the density of data points close to the boundary of the safe state space. We propose an approach that lifts such assumptions on the data, which are hard to satisfy in practice, and instead allows for graceful safety violations, i.e., violations of bounded magnitude that can be spatially controlled. To do so, we employ reachability analysis methods to encapsulate safety constraints in the training process. Specifically, to obtain a computationally efficient over-approximation of the forward reachable set of the closed-loop system, we partition the robot's state space into cells and adaptively subdivide the cells that contain states which may escape the safe set under the trained control law. To this end, we first design appropriate under- and over-approximations of the robot's footprint, which we use to adaptively subdivide the configuration space into cells. Then, using the overlap between each cell's forward reachable set and the set of infeasible robot configurations as a measure of safety violation, we introduce terms into the loss function that penalize this overlap during training. As a result, our method can learn a safe vector field for the closed-loop system and, at the same time, provide numerical worst-case bounds on safety violation over the whole configuration space, defined by the overlap between the over-approximation of the forward reachable set of the closed-loop system and the set of unsafe states. Moreover, it can control the tradeoff between computational complexity and tightness of these bounds. Finally, we provide a simulation study that verifies the efficacy of the proposed scheme.
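
To make the cell-based reachability idea above concrete, the following is a minimal sketch and not the authors' implementation. It assumes a point robot with single-integrator dynamics x⁺ = x + dt·u(x), a ReLU network bounded by plain interval arithmetic, and a single axis-aligned box obstacle; the paper's footprint approximations for arbitrarily shaped robots and the training loop are omitted. All function names (`nn_interval`, `reach_box`, `overlap_area`, `split`, `violation_bounds`) are hypothetical.

```python
import numpy as np

def nn_interval(weights, biases, lo, hi):
    """Propagate the box [lo, hi] through a ReLU network with interval arithmetic."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
        lo, hi = Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b
        if i < len(weights) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def reach_box(weights, biases, lo, hi, dt):
    """Over-approximate {x + dt * u(x) : x in [lo, hi]} (assumed dynamics)."""
    u_lo, u_hi = nn_interval(weights, biases, lo, hi)
    return lo + dt * u_lo, hi + dt * u_hi

def overlap_area(lo, hi, obs_lo, obs_hi):
    """Area of the intersection of two axis-aligned boxes (0 if disjoint)."""
    side = np.minimum(hi, obs_hi) - np.maximum(lo, obs_lo)
    return float(np.prod(np.maximum(side, 0.0)))

def split(lo, hi):
    """Bisect a cell along its longest side."""
    k = int(np.argmax(hi - lo))
    mid = 0.5 * (lo[k] + hi[k])
    left_hi, right_lo = hi.copy(), lo.copy()
    left_hi[k], right_lo[k] = mid, mid
    return [(lo, left_hi), (right_lo, hi)]

def violation_bounds(weights, biases, cells, obs_lo, obs_hi, dt=0.1, max_depth=4):
    """Refine only the cells whose reachable box overlaps the obstacle.

    Returns (lo, hi, overlap) triples; the overlap is the per-cell
    violation measure that a training loss could penalize.
    """
    out, stack = [], [(lo, hi, 0) for lo, hi in cells]
    while stack:
        lo, hi, depth = stack.pop()
        r_lo, r_hi = reach_box(weights, biases, lo, hi, dt)
        v = overlap_area(r_lo, r_hi, obs_lo, obs_hi)
        if v > 0.0 and depth < max_depth:
            stack.extend((a, b, depth + 1) for a, b in split(lo, hi))
        else:
            out.append((lo, hi, v))
    return out
```

In this sketch, `max_depth` plays the role of the complexity/tightness knob mentioned in the abstract: deeper subdivision yields more cells to evaluate but tighter reachable-set bounds. Making the overlap differentiable with respect to the network weights, as a penalty term would require, would call for an automatic-differentiation framework and is left out here.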
