In this paper, we address the problem of generating target grasps by understanding freehand sketches. Sketches are useful for people who cannot formulate language and for cases where a textual description is not available on the fly. However, very few works are aware of the usability of this novel mode of interaction between humans and robots. To this end, we propose a method to generate a potential grasp configuration relevant to the sketch-depicted objects. Due to the inherent ambiguity of sketches with abstract details, we take advantage of a graph that incorporates the structure of the sketch to enhance its representation ability. This graph-represented sketch is further validated to improve the generalization of the network, which is capable of learning sketch-queried grasp detection from a small collection (around 100 samples) of hand-drawn sketches. Additionally, our model is trained and tested in an end-to-end manner, which is easy to implement in real-world applications. Experiments on the multi-object VMRD and GraspNet-1Billion datasets demonstrate the good generalization of the proposed method. Physical robot experiments confirm the utility of our method in object-cluttered scenes.