MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis

Autonomous bin picking poses significant challenges to vision-driven robotic systems given the complexity of the problem, which ranges from various sensor modalities, to highly entangled object layouts, to diverse item properties and gripper types. Existing methods often address the problem from a single perspective. Diverse items and complex bin scenes require diverse picking strategies together with advanced reasoning. As such, building robust and effective machine-learning algorithms for this complex task requires significant amounts of comprehensive, high-quality data. Collecting such data in the real world would be too expensive and time-prohibitive, and therefore intractable from a scalability perspective. To tackle this big, diverse data problem, we take inspiration from the recent rise in the concept of metaverses and introduce MetaGraspNet, a large-scale photo-realistic bin picking dataset constructed via physics-based metaverse synthesis. The proposed dataset contains 217k RGBD images across 82 different article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order, and ambidextrous grasp labels for parallel-jaw and vacuum grippers. We also provide a real dataset consisting of over 2.3k fully annotated high-quality RGBD images, divided into five levels of difficulty and an unseen object set to evaluate different object and layout properties. Finally, we conduct extensive experiments showing that our proposed vacuum seal model and synthetic dataset achieve state-of-the-art performance and generalize to real-world use cases.
