Robot learning of real-world manipulation tasks remains challenging and time-consuming, even though actions are often simplified to single-step manipulation primitives. To compensate for the removed time dependency, we additionally learn an image-to-image transition model that predicts the next state together with its uncertainty. We apply this approach to bin picking, the task of emptying a bin as fast as possible using grasping as well as pre-grasping manipulation. The transition model is trained on up to 42,000 pairs of real-world images taken before and after a manipulation action. Our approach enables two important skills. First, for applications with flange-mounted cameras, picks per hour (PPH) can be increased by around 15% by skipping image measurements. Second, we use the model to plan action sequences ahead of time and to optimize time-dependent rewards, e.g., to minimize the number of actions required to empty the bin. We evaluate both improvements in real-robot experiments and achieve over 700 PPH in the YCB Box and Blocks Test.
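The idea of planning action sequences with a transition model can be illustrated with a minimal toy sketch. It is not the authors' implementation: the state here is a hypothetical `(objects, separated)` tuple rather than an image, `predict` is a hand-written stand-in for the learned model, and the uncertainty budget `max_uncertainty` is an assumption introduced only for illustration.

```python
from itertools import product

# Hypothetical primitives: a grasp and a pre-grasp shift.
ACTIONS = ("grasp", "shift")


def predict(state, action):
    """Toy transition model: return (predicted_next_state, uncertainty)."""
    objects, separated = state
    if action == "grasp":
        # In this toy world a grasp only succeeds on separated objects.
        if separated and objects > 0:
            return (objects - 1, True), 0.1
        return (objects, separated), 0.6
    # A shift separates the objects but removes none of them.
    return (objects, True), 0.2


def plan(state, horizon, max_uncertainty=1.0):
    """Return the shortest action sequence predicted to empty the bin.

    Sequences are rolled out through the model only, without new
    observations. A sequence is accepted if the summed uncertainty stays
    within the (assumed) budget. Returns None if nothing qualifies.
    """
    if state[0] == 0:
        return []
    for length in range(1, horizon + 1):
        for seq in product(ACTIONS, repeat=length):
            s, total_u = state, 0.0
            for a in seq:
                s, u = predict(s, a)
                total_u += u
            if s[0] == 0 and total_u <= max_uncertainty:
                return list(seq)
    return None
```

For two unseparated objects, `plan((2, False), 4)` returns a shift followed by two grasps. Searching shorter sequences first corresponds to minimizing the number of actions needed to empty the bin.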