Energy-function-based safety certificates can provide provable safety guarantees for safe control of complex robotic systems. However, recent studies on learning-based energy function synthesis consider only feasibility, which can cause over-conservativeness and result in less efficient controllers. In this work, we propose a magnitude regularization technique that improves the efficiency of safe controllers by reducing the conservativeness inside the energy function while preserving its provable safety guarantees. Specifically, we quantify conservativeness by the magnitude of the energy function, and we reduce it by adding a magnitude regularization term to the synthesis loss. We further propose SafeMR, an algorithm that uses reinforcement learning (RL) for the synthesis and unifies the learning processes of the safe controller and the energy function. Experimental results show that the proposed method does reduce the conservativeness of the energy functions and outperforms the baselines in controller efficiency while guaranteeing safety.
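To make the regularization concrete, below is a minimal sketch of how a magnitude regularization term could be added to an energy-function synthesis loss. This is an illustration under assumptions, not the paper's implementation: the network architecture, the coefficient `reg_weight`, the placeholder feasibility term, and the names `EnergyFunction` and `synthesis_loss` are all hypothetical.

```python
import torch
import torch.nn as nn

# Hypothetical energy function phi(x): a small MLP mapping states to
# scalar energies. The real architecture is not specified here.
class EnergyFunction(nn.Module):
    def __init__(self, state_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)  # shape: (batch,)

def synthesis_loss(phi_vals: torch.Tensor,
                   feasibility_loss: torch.Tensor,
                   reg_weight: float = 0.1) -> torch.Tensor:
    # Magnitude regularization: penalize the mean magnitude of the
    # energy function over sampled states. Per the abstract's framing,
    # this magnitude quantifies conservativeness, so shrinking it
    # reduces conservativeness while the feasibility term continues
    # to enforce the safety conditions.
    magnitude_reg = phi_vals.abs().mean()
    return feasibility_loss + reg_weight * magnitude_reg

# Usage sketch: evaluate phi on sampled states and combine a placeholder
# feasibility term (which would normally encode the energy-decrease
# safety conditions) with the magnitude regularizer.
states = torch.randn(256, 4)                # batch of sampled 4-D states
phi_vals = EnergyFunction(state_dim=4)(states)
feasibility = torch.relu(phi_vals).mean()   # placeholder only
loss = synthesis_loss(phi_vals, feasibility, reg_weight=0.1)
```

The design choice this sketch illustrates is that the regularizer is purely additive: the feasibility term keeps the safety conditions in force, so the weight `reg_weight` trades conservativeness against feasibility pressure without removing the guarantee itself.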