Enabling robots to assemble objects automatically is a desirable goal. Structural understanding of object parts plays a crucial role in this task, yet it remains relatively unexplored. In this paper, we focus on furniture assembly from a complete set of part geometries, which is essentially a 6-DoF part pose estimation problem. We propose a multi-layer transformer-based framework that performs geometric and relational reasoning between parts to update the part poses iteratively. We design a unique instance encoding that resolves the ambiguity between geometrically similar parts, so that all parts can be distinguished. In addition to assembling from scratch, we extend our framework to a new task called in-process part assembly. Analogous to furniture maintenance, it requires a robot to continue from an unfinished product and assemble the remaining parts into their appropriate positions. Our method improves on the current state of the art by more than 10% on multiple metrics on the public PartNet dataset. Extensive experiments and quantitative comparisons demonstrate the effectiveness of the proposed framework.
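To illustrate why an instance encoding helps with geometrically similar parts, the following is a minimal sketch and not the encoding used in the paper. It assumes that parts have already been grouped by geometric equivalence (the `group_ids` labels and the `max_instances` bound are hypothetical), and it gives each part a one-hot code for its index within its group, so that four identical chair legs receive four different codes.

```python
from collections import defaultdict


def instance_encoding(group_ids, max_instances):
    """Return one one-hot code per part, indexed within its group.

    group_ids: hypothetical labels; parts with equal labels are assumed
               to be geometrically indistinguishable.
    max_instances: assumed upper bound on the size of any group.
    """
    seen = defaultdict(int)  # how many parts of each group were seen so far
    codes = []
    for group in group_ids:
        index = seen[group]
        if index >= max_instances:
            raise ValueError(f"group {group!r} has more than {max_instances} parts")
        one_hot = [0] * max_instances
        one_hot[index] = 1
        codes.append(one_hot)
        seen[group] += 1
    return codes


# Four identical legs and one seat: every leg gets a distinct code.
codes = instance_encoding(["leg", "leg", "seat", "leg", "leg"], max_instances=4)
```

In a setup like this, the code would typically be concatenated with each part's geometric feature before relational reasoning, so that identical parts no longer yield identical inputs and can be assigned different poses.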