Neural Shape Compiler: A Unified Framework for Transforming between Text, Point Cloud, and Program

3D shapes have complementary abstractions from low-level geometry to part-based hierarchies to languages, which convey different levels of information. This paper presents a unified framework to translate between pairs of shape abstractions: $\textit{Text}$ $\Longleftrightarrow$ $\textit{Point Cloud}$ $\Longleftrightarrow$ $\textit{Program}$. We propose $\textbf{Neural Shape Compiler}$ to model the abstraction transformation as a conditional generation process. It converts 3D shapes of three abstract types into unified discrete shape code, transforms each shape code into code of other abstract types through the proposed $\textit{ShapeCode Transformer}$, and decodes them to output the target shape abstraction. Point Cloud code is obtained in a class-agnostic way by the proposed $\textit{Point}$VQVAE. On Text2Shape, ShapeGlot, ABO, Genre, and Program Synthetic datasets, Neural Shape Compiler shows strengths in $\textit{Text}$ $\Longrightarrow$ $\textit{Point Cloud}$, $\textit{Point Cloud}$ $\Longrightarrow$ $\textit{Text}$, $\textit{Point Cloud}$ $\Longrightarrow$ $\textit{Program}$, and Point Cloud Completion tasks. Additionally, Neural Shape Compiler benefits from jointly training on all heterogeneous data and tasks.
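The "unified discrete shape code" step can be illustrated with a generic vector-quantization lookup, the core operation behind VQ-VAE-style models. This is a minimal sketch under stated assumptions, not the paper's PointVQVAE: the codebook values, the 2-D feature size, and the function names `quantize` and `dequantize` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: List[Sequence[float]],
             codebook: List[Sequence[float]]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)),
            key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def dequantize(codes: List[int],
               codebook: List[Sequence[float]]) -> List[List[float]]:
    """Replace each discrete code by its codebook vector."""
    return [list(codebook[k]) for k in codes]


# Toy codebook and features (hypothetical values, not from the paper).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 0.7]]

codes = quantize(features, codebook)    # discrete "shape code" indices
recon = dequantize(codes, codebook)     # quantized features for a decoder
print(codes, recon)
```

In a pipeline of the kind the abstract describes, index sequences like `codes` would be what a sequence model translates between abstraction types before decoding; the learned encoder, decoder, and transformer are omitted here.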



Point Cloud


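A point cloud obtained by laser measurement contains 3D coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains 3D coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains 3D coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of every sampled point on an object's surface have been acquired, the resulting set of points is called a "point cloud".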
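The three acquisition cases above differ only in which per-point attributes are present. A minimal sketch of one way to represent this, assuming a hypothetical `Point` record in which intensity and color are optional:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Point:
    """One sampled surface point; field names are illustrative assumptions."""
    xyz: Tuple[float, float, float]
    intensity: Optional[float] = None          # present for laser scans
    rgb: Optional[Tuple[int, int, int]] = None  # present for photogrammetry


def centroid(cloud: List[Point]) -> Tuple[float, float, float]:
    """Mean XYZ position of a non-empty point cloud."""
    n = len(cloud)
    return tuple(sum(p.xyz[i] for p in cloud) / n for i in range(3))


cloud = [
    Point((0.0, 0.0, 0.0), intensity=0.5),                   # laser only
    Point((2.0, 0.0, 0.0), rgb=(255, 0, 0)),                 # photogrammetry only
    Point((1.0, 3.0, 0.0), intensity=0.2, rgb=(0, 255, 0)),  # combined
]
print(centroid(cloud))
```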
