The basis of many object manipulation algorithms is RGB-D input. Yet, commodity RGB-D sensors can only provide distorted depth maps for a wide range of transparent objects due to light refraction and absorption. To tackle the perception challenges posed by transparent objects, we propose TranspareNet, a joint point cloud and depth completion method, with the ability to complete the depth of transparent objects in cluttered and complex scenes, even with partially filled fluid contents within the vessels. To address the shortcomings of existing transparent object data collection schemes in the literature, we also propose an automated dataset creation workflow that consists of robot-controlled image collection and vision-based automatic annotation. Through this automated workflow, we created the Toronto Transparent Objects Depth Dataset (TODD), which consists of nearly 15,000 RGB-D images. Our experimental evaluation demonstrates that TranspareNet outperforms existing state-of-the-art depth completion methods on multiple datasets, including ClearGrasp, and that it also handles cluttered scenes when trained on TODD. Code and dataset will be released at https://www.pair.toronto.edu/TranspareNet/