Safety has become one of the main challenges in applying deep reinforcement learning to real-world systems. Currently, incorporating external knowledge such as human oversight is the only means of preventing the agent from visiting catastrophic states. In this paper, we propose MBHI, a novel framework for safe model-based reinforcement learning that ensures safety at the state level and can effectively avoid both "local" and "non-local" catastrophes. In MBHI, an ensemble of supervised learners is trained to imitate human blocking decisions. Similar to the human decision-making process, MBHI rolls out an imagined trajectory in the dynamics model before executing an action in the environment and estimates its safety. When the imagination encounters a catastrophe, MBHI blocks the current action and uses an efficient MPC method to output a safe policy. We evaluate our method on several safety tasks, and the results show that MBHI achieves better performance than the baselines in terms of sample efficiency and number of catastrophes.
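To make the blocking-and-fallback loop described above concrete, the following is a minimal sketch, assuming hypothetical interfaces (`dynamics_model`, `policy`, `blocker_ensemble`, and the averaging and random-shooting choices) that are not taken from the paper; it only illustrates the general idea of imagining a trajectory, checking it with a learned blocker ensemble, and falling back to a simple MPC search when a catastrophe is predicted.

```python
import numpy as np

def mbhi_action_filter(state, proposed_action, dynamics_model, policy,
                       blocker_ensemble, horizon=10, n_candidates=100,
                       block_threshold=0.5):
    """Return `proposed_action` if an imagined rollout looks safe; otherwise
    fall back to a simple MPC (random-shooting) search for a safer action.
    All components are assumed interfaces, not the authors' implementation.

    dynamics_model(state, action) -> next_state       # learned dynamics model
    policy(state) -> action                           # task policy
    blocker_ensemble: list of classifiers, each with
        predict_proba(state, action) -> P(catastrophe)  # imitates human blocking
    """

    def rollout_is_safe(s, first_action):
        # Imagine a trajectory in the learned model and ask the blocker
        # ensemble whether any step looks catastrophic.
        a = first_action
        for _ in range(horizon):
            # Average the ensemble's catastrophe probability (one simple
            # aggregation rule; the paper may use a different one).
            p_cat = np.mean([b.predict_proba(s, a) for b in blocker_ensemble])
            if p_cat > block_threshold:
                return False
            s = dynamics_model(s, a)
            a = policy(s)
        return True

    if rollout_is_safe(state, proposed_action):
        return proposed_action

    # MPC fallback: sample perturbed candidate actions and keep the safest
    # one whose imagined rollout stays below the blocking threshold.
    best_action, best_risk = proposed_action, np.inf
    for _ in range(n_candidates):
        candidate = policy(state) + np.random.normal(
            scale=0.1, size=np.shape(proposed_action))
        risk = np.mean([b.predict_proba(state, candidate)
                        for b in blocker_ensemble])
        if rollout_is_safe(state, candidate) and risk < best_risk:
            best_action, best_risk = candidate, risk
    return best_action
```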