Generalizable 3D part segmentation is important but challenging in vision and robotics. Training deep models via conventional supervised methods requires large-scale 3D datasets with fine-grained part annotations, which are costly to collect. This paper explores an alternative approach to low-shot part segmentation of 3D point clouds by leveraging a pretrained image-language model, GLIP, which achieves superior performance on open-vocabulary 2D detection. We transfer the rich knowledge from 2D to 3D through GLIP-based part detection on point cloud renderings and a novel 2D-to-3D label lifting algorithm. We also utilize multi-view 3D priors and few-shot prompt tuning to boost performance significantly. Extensive evaluation on the PartNet and PartNet-Mobility datasets shows that our method enables excellent zero-shot 3D part segmentation. Our few-shot version not only outperforms existing few-shot approaches by a large margin but also achieves highly competitive results compared to the fully supervised counterpart. Furthermore, we demonstrate that our method can be directly applied to iPhone-scanned point clouds without significant domain gaps.