Object rearranging is one of the most common deformable manipulation tasks: the robot must rearrange a deformable object into a goal configuration. Previous studies focus on designing an expert system for each specific task using model-based or data-driven approaches, so their application scenarios are limited. Some research has attempted to design a general framework that provides more advanced manipulation capabilities for deformable rearranging tasks, and has made considerable progress in simulation. However, transferring from simulation to reality is difficult because of the limitations of the end-to-end CNN architecture. To address these challenges, we design a local GNN (Graph Neural Network) based learning method that uses two representation graphs to encode keypoints detected from images. Self-attention is applied for graph updating, and cross-attention is applied for generating manipulation actions. Extensive experiments demonstrate that our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
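The attention pattern described in the abstract (self-attention to update each keypoint graph, cross-attention between the two graphs to produce an action) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The keypoint counts, the embedding size, the random untrained projection `W_embed`, the residual update, and the pick-and-place readout are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, weights)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights

rng = np.random.default_rng(0)
n_keypoints, dim = 8, 16  # hypothetical sizes

# Hypothetical inputs: 2-D keypoints detected in the current and goal images.
current_xy = rng.uniform(0.0, 1.0, (n_keypoints, 2))
goal_xy = rng.uniform(0.0, 1.0, (n_keypoints, 2))

# Untrained random projection standing in for a learned node encoder (assumption).
W_embed = rng.normal(size=(2, dim))
current_nodes = current_xy @ W_embed
goal_nodes = goal_xy @ W_embed

# Self-attention: each graph updates its own nodes (residual update assumed).
current_update, _ = attention(current_nodes, current_nodes, current_nodes)
goal_update, _ = attention(goal_nodes, goal_nodes, goal_nodes)
current_nodes = current_nodes + current_update
goal_nodes = goal_nodes + goal_update

# Cross-attention: current-graph nodes query goal-graph nodes.
_, cross_weights = attention(current_nodes, goal_nodes, goal_nodes)

# Illustrative action readout (assumption): pick the keypoint whose attention
# row is most peaked, and place it at the attention-weighted goal position.
place_xy = cross_weights @ goal_xy
pick_idx = int(np.argmax(cross_weights.max(axis=1)))
action = (current_xy[pick_idx], place_xy[pick_idx])
```

Because each row of `cross_weights` sums to one, every candidate place position is a convex combination of goal keypoints, so it stays inside the region spanned by the goal configuration. A trained model would replace the random projection and the hand-written readout with learned components.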