Autonomous excavation is a challenging task. The unknown contact dynamics between the excavator bucket and the terrain can easily produce large contact forces and cause jamming during excavation. Traditional model-based methods struggle with these problems because the dynamics are difficult to model. In this paper, we formulate excavation skills with three novel manipulation primitives. We propose to learn the manipulation primitives with offline reinforcement learning (RL) to avoid large amounts of online robot interaction. The proposed method can learn efficient penetration skills from sub-optimal demonstrations, which contain sub-trajectories that can be ``stitched'' together to form an optimal trajectory without causing jamming. We evaluate the proposed method with extensive experiments on excavating a variety of rigid objects and demonstrate that the learned policy outperforms the demonstrations. We also show that the learned policy can quickly adapt to unseen and challenging fragmented rocks with online fine-tuning.
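The ``stitching'' idea can be illustrated with a minimal sketch. This is not the method of the paper: it is a toy tabular fitted Q-iteration, and the state names, action names, rewards, and the helpers `fit_q` and `greedy_rollout` are all hypothetical. Two sub-optimal demonstrations share the state `mid`; neither goes from `start` to a full bucket, yet a policy learned from the fixed dataset alone does.

```python
from collections import defaultdict

GAMMA = 0.9  # assumed discount factor for this toy example

# Hypothetical demonstrations as (state, action, reward, next_state, done).
# Demo 1 starts where we care about but ends in a jam.
# Demo 2 starts elsewhere but ends with a full bucket.
demos = [
    [("start", "push", 0.0, "mid", False), ("mid", "dig_deep", -1.0, "jam", True)],
    [("other", "push", 0.0, "mid", False), ("mid", "lift", 1.0, "full", True)],
]
dataset = [t for demo in demos for t in demo]


def fit_q(dataset, gamma=GAMMA, sweeps=50):
    """Tabular Q-iteration over a fixed dataset, with no environment access."""
    seen = defaultdict(set)  # actions observed in each state
    for s, a, _, _, _ in dataset:
        seen[s].add(a)
    q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s2, done in dataset:
            # Bootstrap only over actions that appear in the data for s2,
            # so the backup never relies on unseen actions.
            future = 0.0 if done else max(
                (q[(s2, a2)] for a2 in seen[s2]), default=0.0
            )
            q[(s, a)] = r + gamma * future
    return q, seen


def greedy_rollout(q, seen, dataset, state, max_steps=10):
    """Follow the greedy policy using the transitions recorded in the data."""
    model = {(s, a): (r, s2, done) for s, a, r, s2, done in dataset}
    path, total = [state], 0.0
    for _ in range(max_steps):
        if not seen.get(state):
            break  # no recorded action for this state
        action = max(seen[state], key=lambda a: q[(state, a)])
        r, state, done = model[(state, action)]
        total += r
        path.append(state)
        if done:
            break
    return path, total


q, seen = fit_q(dataset)
path, total = greedy_rollout(q, seen, dataset, "start")
print(path, total)
```

In this toy setting the greedy policy takes the first step of demo 1 and the second step of demo 2, so it reaches `full` from `start` even though no single demonstration does. Practical offline RL algorithms add function approximation and some form of conservatism toward the data, which this sketch omits.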