Learning-enabled components (LECs) have greatly assisted cyber-physical systems in achieving higher levels of autonomy. However, the susceptibility of LECs to dynamic and uncertain operating conditions is a critical challenge for the safety of these systems. Redundant controller architectures have been widely adopted for safety assurance in such contexts. These architectures augment hard-to-verify LEC "performant" controllers with "safety" controllers and with decision logic that switches between them. While these architectures ensure safety, we point out two limitations. First, they are trained offline to learn a conservative policy of always selecting a controller that maintains the system's safety, which limits the system's adaptability to dynamic and non-stationary environments. Second, they do not support reverse switching from the safety controller to the performant controller, even when the threat to safety is no longer present. To address these limitations, we propose a dynamic simplex strategy with an online controller switching logic that allows two-way switching. We treat switching as a sequential decision-making problem and model it as a semi-Markov decision process. We combine a myopic selector that uses surrogate models (for the forward switch) with a non-myopic planner (for the reverse switch) to balance safety and performance. We evaluate this approach in an autonomous vehicle case study in the CARLA simulator under different driving conditions, locations, and component failures. We show that the proposed approach results in fewer collisions and higher performance than state-of-the-art alternatives.
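To make the idea of two-way switching concrete, the following is a minimal sketch and not the method described above. It assumes a hypothetical `risk_estimate` function standing in for the surrogate models, replaces the non-myopic planner with a simple worst-case check over a list of predicted observations, and uses illustrative thresholds; the class name, parameters, and threshold values are all assumptions. The semi-Markov decision process formulation is not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

PERFORMANT = "performant"
SAFETY = "safety"


@dataclass
class SwitchingLogic:
    """Toy two-way switcher (illustrative only, all names hypothetical)."""

    # Hypothetical surrogate: maps an observation to a risk score in [0, 1].
    risk_estimate: Callable[[float], float]
    forward_threshold: float = 0.6  # assumed value, not from the source
    reverse_threshold: float = 0.3  # assumed value, not from the source
    active: str = PERFORMANT

    def step(self, observation: float, lookahead: Sequence[float]) -> str:
        """Return the controller to use for this decision step."""
        if self.active == PERFORMANT:
            # Forward switch (myopic): look only at the current risk estimate.
            if self.risk_estimate(observation) > self.forward_threshold:
                self.active = SAFETY
        else:
            # Reverse switch (stand-in for a non-myopic planner): hand control
            # back only if the current and all predicted risks stay low.
            worst = max(self.risk_estimate(o) for o in [observation, *lookahead])
            if worst < self.reverse_threshold:
                self.active = PERFORMANT
        return self.active


if __name__ == "__main__":
    logic = SwitchingLogic(risk_estimate=lambda x: x)
    for obs, future in [(0.2, []), (0.9, []), (0.1, [0.5]), (0.1, [0.2])]:
        print(obs, future, logic.step(obs, future))
```

Using a lower threshold for the reverse switch than for the forward switch is a design choice of this sketch: it adds hysteresis so the logic does not oscillate between controllers when the risk hovers near a single cutoff.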