In this study, we leverage the deliberate and systematic fault-injection capabilities of an open-source benchmark suite to perform a series of experiments on state-of-the-art deep and robust reinforcement learning algorithms. We aim to benchmark robustness in continuous action spaces, which are crucial for deployment in robot control. We find that the algorithms are more robust to action disturbances than to disturbances in observations and dynamics. We also observe that state-of-the-art approaches not explicitly designed to improve robustness perform comparably to those that are. Our study and results are intended to provide insight into the current state of safe and robust reinforcement learning and a foundation for advancing the field, in particular for deployment in robotic systems.