Grasping an object that is in an ungraspable pose, such as a book or other large flat object lying horizontally on a table, is a challenging task. Inspired by human manipulation, we address this problem by pushing the object to the edge of the table and then grasping it from the overhanging part. In this paper, we develop a model-free Deep Reinforcement Learning framework to synergize pushing and grasping actions. We first pre-train a Variational Autoencoder to extract high-dimensional features from input scene images. A Proximal Policy Optimization algorithm with a common reward and shared Actor-Critic layers is then employed to learn both pushing and grasping actions with high data efficiency. Experiments show that our single-network policy converges 2.5 times faster than a policy using two parallel networks. Moreover, experiments on unseen objects show that our policy generalizes to the challenging cases of objects with curved surfaces and off-center, irregularly shaped objects. Lastly, our policy can be transferred to a real robot without fine-tuning by using CycleGAN for domain adaptation, and it outperforms the push-to-wall baseline.
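To make the "shared Actor-Critic layers" design concrete, below is a minimal sketch (not the authors' code) of a PPO-style actor-critic in which both heads branch off one shared trunk. It assumes the VAE has already encoded the scene image into a latent vector; the names `SharedActorCritic`, `latent_dim`, `hidden_dim`, and `n_actions`, as well as a discrete action parameterization, are illustrative assumptions.

```python
# Hypothetical sketch of a shared-layer actor-critic for PPO.
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    def __init__(self, latent_dim=128, hidden_dim=256, n_actions=16):
        super().__init__()
        # Trunk shared by both heads: one set of weights serves both the
        # pushing and grasping actions, which is the data-efficiency point
        # made in the abstract.
        self.trunk = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.actor = nn.Linear(hidden_dim, n_actions)  # action logits
        self.critic = nn.Linear(hidden_dim, 1)         # state-value estimate

    def forward(self, z):
        h = self.trunk(z)
        policy = torch.distributions.Categorical(logits=self.actor(h))
        return policy, self.critic(h)

# Usage: sample an action and get the value estimate for a VAE latent.
z = torch.randn(1, 128)  # stand-in for a VAE-encoded scene image
policy, value = SharedActorCritic()(z)
action = policy.sample()
```

Because the actor and critic heads share the trunk, gradients from both the policy loss and the value loss update the same features, in contrast to the two-parallel-network baseline the abstract compares against.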