Transfer learning from synthetic to real data has proved to be an effective way of mitigating data annotation constraints in various computer vision tasks. However, such developments have focused on 2D images and lag far behind for 3D point clouds, due to the lack of large-scale, high-quality synthetic point cloud data and effective transfer methods. We address this issue by collecting SynLiDAR, a synthetic LiDAR point cloud dataset that contains large-scale, point-wise annotated point clouds with accurate geometric shapes and comprehensive semantic classes, and by designing PCT-Net, a point cloud translation network that aims to narrow the gap with real-world point cloud data. For SynLiDAR, we leverage graphics tools and professionals to construct multiple realistic virtual environments with rich scene types and layouts, from which annotated LiDAR points can be generated automatically. On top of that, PCT-Net disentangles synthetic-to-real gaps into an appearance component and a sparsity component, and translates SynLiDAR by aligning the two components with real-world data separately. Extensive experiments over multiple data augmentation and semi-supervised semantic segmentation tasks show very positive outcomes: SynLiDAR can either train better models or reduce the amount of annotated real-world data without sacrificing performance, and PCT-Net-translated data consistently improve model performance further.