This letter presents contact-safe Model-based Reinforcement Learning (MBRL) for robot applications, which achieves contact-safe behaviors during the learning process. In typical MBRL, sample scarcity means we cannot expect the data-driven model to generate accurate and reliable policies for the intended robotic tasks while learning is still in progress. Operating these unreliable policies in a contact-rich environment could damage the robot and its surroundings. To reduce the risk of damage from unexpectedly intense physical contact, we present a contact-safe MBRL method that associates the control limits of probabilistic Model Predictive Control (pMPC) with the model uncertainty, so that the allowed acceleration of the controlled behavior is adjusted according to learning progress. Control planning with such uncertainty-aware control limits is formulated as a deterministic MPC problem using a computationally efficient approximate Gaussian process (GP) dynamics model and an approximate inference technique. The effectiveness of our approach is evaluated on bowl mixing tasks with simulated and real robots and on scooping tasks with a real robot, as examples of contact-rich manipulation skills. (video: https://youtu.be/LfzYhJaHies)
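The idea of tying control limits to model uncertainty can be illustrated with a minimal sketch. The abstract does not state the actual mapping, so everything below is an assumption made for illustration: the exponential form, the function names `acceleration_limit` and `clip_controls`, and the parameters `a_max`, `a_min`, and `gain` are all hypothetical. The sketch also simplifies by clipping an already planned control sequence, whereas the letter describes planning under the limits inside the MPC problem itself.

```python
import math


def acceleration_limit(sigma, a_max=2.0, a_min=0.2, gain=5.0):
    """Map a predictive standard deviation to an allowed acceleration magnitude.

    Hypothetical mapping (not taken from the paper): the limit decays
    exponentially from a_max (confident model) toward a_min (uncertain model).
    """
    if sigma < 0.0:
        raise ValueError("sigma must be non-negative")
    return a_min + (a_max - a_min) * math.exp(-gain * sigma)


def clip_controls(controls, sigmas, **limit_kwargs):
    """Clip each planned acceleration to the limit implied by its uncertainty.

    controls: planned accelerations, one per step of the horizon
    sigmas:   predictive standard deviations of the model at the same steps
    """
    if len(controls) != len(sigmas):
        raise ValueError("controls and sigmas must have the same length")
    clipped = []
    for u, sigma in zip(controls, sigmas):
        limit = acceleration_limit(sigma, **limit_kwargs)
        # Symmetric bound: -limit <= u <= limit
        clipped.append(max(-limit, min(limit, u)))
    return clipped


if __name__ == "__main__":
    planned = [3.0, -3.0, 0.1]
    uncertainty = [0.0, 1.0, 0.5]
    print(clip_controls(planned, uncertainty))
```

Under these assumptions, early in learning (large predictive uncertainty) the allowed acceleration stays near `a_min`, and it relaxes toward `a_max` as the model becomes more confident, which mirrors the behavior the abstract describes.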