This letter presents contact-safe Model-based Reinforcement Learning (MBRL) for robot applications, which achieves contact-safe behaviors during the learning process. In typical MBRL, the data-driven model cannot be expected to generate accurate and reliable policies for the intended robotic tasks during learning because samples are scarce. Executing such unreliable policies in a contact-rich environment could damage the robot and its surroundings. To reduce the risk of damage from unexpected, intense physical contact, we present contact-safe MBRL, which ties the control limits of probabilistic Model Predictive Control (pMPC) to the model uncertainty so that the allowed acceleration of the controlled behavior is adjusted according to learning progress. Control planning with such uncertainty-aware control limits is formulated as a deterministic MPC problem using computationally efficient approximate Gaussian process (GP) dynamics and an approximate inference technique. The effectiveness of our approach is evaluated on bowl mixing tasks with simulated and real robots and on scooping tasks with a real robot, as examples of contact-rich manipulation skills. (video: https://youtu.be/sdhHP3NhYi0)
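The idea of tying control limits to model uncertainty can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the exponential mapping, the function names `uncertainty_aware_limit` and `clip_control`, and the parameters `u_max`, `u_min`, and `sigma_ref` are hypothetical and are not taken from the letter. The sketch only shows the qualitative behavior described above: a large predictive standard deviation (early in learning) yields a tight acceleration limit, and the limit relaxes as the model becomes more certain.

```python
import math


def uncertainty_aware_limit(sigma, u_max=1.0, u_min=0.1, sigma_ref=0.5):
    """Map a predictive std-dev to a symmetric control (acceleration) limit.

    Hypothetical mapping, not the letter's formulation: the limit decays
    exponentially from u_max (sigma = 0) toward u_min (sigma -> infinity).
    """
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    return u_min + (u_max - u_min) * math.exp(-sigma / sigma_ref)


def clip_control(u, limit):
    """Clip a candidate control to the symmetric interval [-limit, limit]."""
    return max(-limit, min(limit, u))


if __name__ == "__main__":
    # Decreasing sigma stands in for learning progress (assumed values).
    for sigma in (2.0, 0.5, 0.0):
        limit = uncertainty_aware_limit(sigma)
        applied = clip_control(0.8, limit)
        print(f"sigma={sigma:.1f}  limit={limit:.3f}  applied={applied:.3f}")
```

In an MPC setting, a limit of this kind would enter as a bound on the control inputs at each step of the planning horizon rather than as a post hoc clip; the clip is used here only to keep the example short.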