There has been increasing interest in smart factories powered by robotic systems to tackle repetitive, laborious tasks. One impactful yet challenging task in robotics-powered smart factory applications is robotic grasping: using robotic arms to grasp objects autonomously in different settings. Robotic grasping requires a variety of computer vision tasks such as object detection, segmentation, grasp prediction, and pick planning. While significant progress has been made in leveraging machine learning for robotic grasping, particularly with deep learning, a big challenge remains: the need for large-scale, high-quality RGBD datasets that cover a wide diversity of scenarios and permutations. To tackle this big, diverse data problem, we take inspiration from the recent rise of the metaverse, which has greatly closed the gap between virtual worlds and the physical world. Metaverses allow us to create digital twins of real-world manufacturing scenarios and to virtually create different scenarios from which large volumes of data can be generated for training models. In this paper, we present MetaGraspNet: a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis. The proposed dataset contains 100,000 images and 25 different object types, and is split into 5 difficulty levels to evaluate object detection and segmentation model performance in different grasping scenarios. We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance in a manner that is more appropriate for robotic grasp applications than existing general-purpose performance metrics. Our benchmark dataset is available open-source on Kaggle, with the first phase consisting of detailed object detection, segmentation, and layout annotations, as well as a layout-weighted performance metric script.