Despite the impressive progress achieved in robust grasp detection, robots are still not skilled at sophisticated grasping tasks (e.g., searching for and grasping a specific object in clutter). Such tasks involve not only grasping, but also a comprehensive perception of the visual world (e.g., the relationships between objects). Recently, advanced deep learning techniques have provided a promising way to understand such high-level visual concepts, encouraging robotic researchers to explore solutions in this hard and complicated field. However, deep learning is usually data-hungry, and the lack of data severely limits the performance of deep-learning-based algorithms. In this paper, we present a new dataset named \regrad to support the modeling of relationships among objects and grasps. We collect annotations of object poses, segmentations, grasps, and relationships for each image, enabling a comprehensive perception of grasping. Our dataset is provided in the form of both 2D images and 3D point clouds. Moreover, since all the data are generated automatically, users are free to import their own object models and generate as much data as they want. We have released our dataset and code. A video demonstrating the data generation process is also available.