During learning trials, systems are exposed to different failure conditions that may break robotic parts before a safe behavior is discovered. Humans circumvent this problem by first grounding their learning in a safer structure or control and then gradually increasing the difficulty. This paper presents the impact of similar supports on learning a stable gait on a quadruped robot. Based on the psychological theory of instructional scaffolding, we provide different support settings to our robot, evaluated with strain gauges, and use Bayesian Optimization to conduct a parametric search for a stable Raibert controller. We perform several experiments comparing constant supports with gradually reduced supports during gait learning, and our results show that a gradually reduced support produces a more stable gait than a support at a fixed height. Although gaps between simulation and reality can lead robots to catastrophic failures, our proposed method combines speed and safety when learning a new behavior.