Learning to generate 3D point clouds without 3D supervision is an important but challenging problem. Current solutions leverage various differentiable renderers to project the generated 3D point clouds onto a 2D image plane, and train deep neural networks using the per-pixel difference with 2D ground truth images. However, these solutions still struggle to fully recover fine structures of 3D shapes, such as thin tubes or planes. To resolve this issue, we propose an unsupervised approach for 3D point cloud generation with fine structures. Specifically, we cast 3D point cloud learning as a 2D projection matching problem. Rather than using entire 2D silhouette images as regular per-pixel supervision, we introduce structure adaptive sampling to randomly sample 2D points within the silhouettes as irregular point supervision, which alleviates the consistency issue of sampling from different view angles. Our method pushes the neural network to generate a 3D point cloud whose 2D projections match the irregular point supervision from different view angles. Our 2D projection matching approach enables the neural network to learn more accurate structure information than using the per-pixel difference, especially for fine and thin 3D structures. Our method can recover fine 3D structures from 2D silhouette images at different resolutions, and is robust to different sampling methods and to the number of points in the irregular point supervision. Our method outperforms others under widely used benchmarks. Our code, data and models are available at https://github.com/chenchao15/2D\_projection\_matching.
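The core idea of 2D projection matching can be sketched as follows: project the generated 3D points into each view with a pinhole camera, then measure a set-to-set distance (here a symmetric Chamfer distance, one common choice) against the 2D points sampled inside the ground-truth silhouette. This is a minimal NumPy sketch, not the paper's implementation; the camera parameterization (`K`, `R`, `t`) and the use of Chamfer distance are illustrative assumptions.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project an Nx3 point cloud to Nx2 image coordinates with a
    pinhole camera (K: 3x3 intrinsics, R: 3x3 rotation, t: translation).
    Hypothetical camera setup for illustration."""
    cam = R @ points_3d.T + t.reshape(3, 1)   # 3xN, camera frame
    uv = K @ cam                              # 3xN, homogeneous pixels
    return (uv[:2] / uv[2]).T                 # Nx2, perspective divide

def chamfer_2d(a, b):
    """Symmetric Chamfer distance between two 2D point sets a (Mx2) and b (Nx2)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # MxN pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def projection_matching_loss(points_3d, silhouette_samples, K, R, t):
    """Match the cloud's 2D projection against irregular point supervision:
    2D points randomly sampled inside the ground-truth silhouette."""
    proj = project_points(points_3d, K, R, t)
    return chamfer_2d(proj, silhouette_samples)
```

In training, this loss would be summed over several view angles and minimized with respect to the generated point cloud (using a differentiable framework rather than NumPy), so the network learns 3D structure only from 2D point supervision.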