safe-control-gym: a Unified Benchmark Suite for Safe Learning-based Control and Reinforcement Learning

In recent years, reinforcement learning and learning-based control, as well as the study of their safety, which is crucial for deployment on real-world robots, have gained significant traction. However, to adequately gauge the progress and applicability of new results, we need tools to equitably compare the approaches proposed by the controls and reinforcement learning communities. Here, we propose a new open-source benchmark suite, called safe-control-gym. Our starting point is OpenAI's Gym API, which is one of the de facto standards in reinforcement learning research. Yet, we highlight the reasons for its limited appeal to control theory researchers, and to safe control in particular: for example, the lack of analytical models and constraint specifications. Thus, we propose to extend this API with (i) the ability to specify (and query) symbolic models and constraints and (ii) the ability to introduce simulated disturbances in the control inputs, measurements, and inertial properties. We provide implementations for three dynamic systems (the cart-pole and the 1D and 2D quadrotors) and two control tasks (stabilization and trajectory tracking). To demonstrate our proposal, and in an attempt to bring research communities closer together, we show how to use safe-control-gym to quantitatively compare the control performance, data efficiency, and safety of multiple approaches from the areas of traditional control, learning-based control, and reinforcement learning.
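To make the two proposed extensions concrete, the following is a minimal illustrative sketch of a Gym-style 1D quadrotor environment that exposes a queryable nominal model, a list of constraints, and injectable disturbances. It is not the safe-control-gym API: the class name `Quadrotor1DEnv`, its parameters, the constraint bounds, and the reward are all assumptions made for illustration. The nominal model is written as a plain Python function to keep the sketch dependency-free, standing in for a true symbolic expression.

```python
import random
from dataclasses import dataclass

GRAVITY = 9.81  # gravitational acceleration [m/s^2]


@dataclass
class Quadrotor1DEnv:
    """Hypothetical Gym-style 1D quadrotor (illustration only, not the real API)."""
    mass: float = 1.0              # nominal mass known to the controller [kg]
    dt: float = 0.02               # Euler integration step [s]
    action_noise_std: float = 0.0  # std of the disturbance added to the thrust [N]
    obs_noise_std: float = 0.0     # std of the disturbance added to measurements
    mass_rand_range: float = 0.0   # half-width of the uniform mass perturbation [kg]
    seed: int = 0

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.true_mass = self.mass
        self.state = (1.0, 0.0)  # (height z, vertical velocity z_dot)
        # Constraints in the form g(x, u) <= 0; the bounds are assumed values.
        self.constraints = [
            lambda x, u: -x[0],          # z >= 0
            lambda x, u: x[0] - 2.0,     # z <= 2
            lambda x, u: abs(u) - 20.0,  # |thrust| <= 20 N
        ]

    def model(self, x, u):
        """Nominal dynamics x_dot = f(x, u) that a model-based controller can query."""
        _, z_dot = x
        return (z_dot, u / self.mass - GRAVITY)

    def reset(self):
        # Perturb the inertial property so the true plant differs from the model.
        delta = self.rng.uniform(-self.mass_rand_range, self.mass_rand_range)
        self.true_mass = self.mass + delta
        self.state = (1.0, 0.0)
        return self._observe()

    def step(self, u):
        # Disturb the control input, then integrate the true (perturbed) plant.
        applied = u + self.rng.gauss(0.0, self.action_noise_std)
        z, z_dot = self.state
        z_ddot = applied / self.true_mass - GRAVITY
        self.state = (z + self.dt * z_dot, z_dot + self.dt * z_ddot)
        violated = any(g(self.state, u) > 0.0 for g in self.constraints)
        reward = -(self.state[0] - 1.0) ** 2  # assumed cost: distance to z = 1
        return self._observe(), reward, False, {"constraint_violation": violated}

    def _observe(self):
        # Disturb the measurement returned to the agent.
        return tuple(s + self.rng.gauss(0.0, self.obs_noise_std) for s in self.state)
```

Under these assumptions, a controller can compute the hover thrust `env.mass * GRAVITY` from the queried model, while a reinforcement learning agent can use only `reset` and `step`. Setting the noise and mass-randomization parameters to non-zero values then tests both against the same disturbances, and the `constraint_violation` flag in `info` gives a shared safety measure.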
