In recent years, both reinforcement learning and learning-based control -- as well as the study of their safety, which is crucial for deployment on real-world robots -- have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym, supporting both model-based and data-based control techniques. We provide implementations for three dynamic systems -- the cart-pole and the 1D and 2D quadrotors -- and two control tasks -- stabilization and trajectory tracking. We propose to extend OpenAI's Gym API -- the de facto standard in reinforcement learning research -- with the ability to (i) specify (and query) symbolic dynamics, (ii) specify (and query) constraints, and (iii) (repeatably) inject simulated disturbances in the control inputs, state measurements, and inertial properties. To demonstrate our proposal, and in an attempt to bring the research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the fields of traditional control, learning-based control, and reinforcement learning.
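As a rough illustration of the three proposed extensions (queryable dynamics, queryable constraints, and repeatable disturbance injection), the following minimal sketch wraps a 1D double integrator in a Gym-style `reset`/`step` interface. It is not the safe-control-gym API: the class `ToyDoubleIntegratorEnv`, its methods `dynamics` and `constraint_violated`, the helper `rollout`, and all parameter values are hypothetical names chosen for this example. A plain Python function stands in for a symbolic model, and the toy system is not one of the three systems provided by the suite.

```python
import numpy as np


class ToyDoubleIntegratorEnv:
    """Hypothetical Gym-style env (reset/step) for a 1D double integrator.

    Illustrative only: names and signatures are assumptions, not the
    safe-control-gym API.
    """

    def __init__(self, dt=0.05, pos_limit=1.0, noise_std=0.1, seed=0):
        self.dt = dt
        self.pos_limit = pos_limit
        self.noise_std = noise_std
        self.seed = seed
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(2)

    def dynamics(self, x, u):
        """Queryable nominal model x_next = f(x, u) (stand-in for symbolic dynamics)."""
        pos, vel = x
        return np.array([pos + self.dt * vel, vel + self.dt * u])

    def constraint_violated(self, x):
        """State constraint |pos| <= pos_limit, exposed so a controller can query it."""
        return bool(abs(x[0]) > self.pos_limit)

    def reset(self):
        # Re-seeding on reset makes the injected disturbance sequence repeatable.
        self.rng = np.random.default_rng(self.seed)
        self.state = np.array([0.5, 0.0])
        return self.state.copy()

    def step(self, action):
        # Simulated disturbance injected on the control input.
        disturbed = float(action) + self.rng.normal(0.0, self.noise_std)
        self.state = self.dynamics(self.state, disturbed)
        cost = float(self.state @ self.state)
        info = {"constraint_violation": self.constraint_violated(self.state)}
        return self.state.copy(), -cost, False, info


def rollout(env, gains, steps=50):
    """Apply linear state feedback u = -K x; return visited states and violation count."""
    x = env.reset()
    states, violations = [x], 0
    for _ in range(steps):
        u = -float(gains @ x)
        x, _, _, info = env.step(u)
        states.append(x)
        violations += int(info["constraint_violation"])
    return np.array(states), violations
```

Because the disturbance generator is re-seeded on every `reset`, two rollouts with the same controller see the same noise sequence. In a setup like this, differences in cost or constraint violations between two rollouts can then be attributed to the controllers rather than to the random draw. A model-based controller could call `dynamics` directly, while a data-based one would only use `step`.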