Dexterous manipulation of arbitrary objects, a fundamental daily task for humans, has been a grand challenge for autonomous robotic systems. Although data-driven approaches using reinforcement learning can develop specialist policies that discover behaviors to control a single object, these policies often generalize poorly to unseen objects. In this work, we show that policies learned by existing reinforcement learning algorithms can in fact be generalist when combined with multi-task learning and a well-chosen object representation. We show that a single generalist policy can perform in-hand manipulation of over 100 geometrically diverse real-world objects and generalize to new objects with unseen shape or size. Interestingly, we find that multi-task learning with object point cloud representations not only generalizes better but even outperforms the single-object specialist policies on both training and held-out test objects. Video results are available at https://huangwl18.github.io/geometry-dex