With rising industrial attention to 3D virtual modeling technology, generating novel 3D content from specified conditions (e.g., text) has become a hot topic. In this paper, we propose a new generative 3D modeling framework called Diffusion-SDF for the challenging task of text-to-shape synthesis. Previous approaches lack flexibility in both 3D data representation and shape generation, and thus fail to generate highly diversified 3D shapes that conform to the given text descriptions. To address this, we propose an SDF autoencoder together with the Voxelized Diffusion model to learn and generate representations for voxelized signed distance fields (SDFs) of 3D shapes. Specifically, we design a novel UinU-Net architecture that implants a local-focused inner network inside the standard U-Net architecture, which enables better reconstruction of patch-independent SDF representations. We further extend our approach to related text-to-shape tasks, including text-conditioned shape completion and manipulation. Experimental results show that Diffusion-SDF generates both high-quality and highly diversified 3D shapes that conform well to the given text descriptions, outperforming previous state-of-the-art text-to-shape approaches.
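To make the UinU-Net idea concrete, below is a minimal, hypothetical sketch: a small 3D U-Net whose bottleneck additionally applies a "local-focused" inner network that refines each spatial location of the voxelized SDF representation independently of its neighbors. The use of PyTorch, the layer sizes, and the choice of a per-voxel MLP (1x1x1 convolutions) as the inner network are assumptions made for illustration, not the authors' exact configuration.

```python
# Hypothetical sketch of a "UinU-Net"-style block: a standard 3D U-Net with a
# local (patch-independent) inner network inserted at the bottleneck.
# All sizes and layer choices are illustrative assumptions.
import torch
import torch.nn as nn


class LocalInnerNet(nn.Module):
    """Per-location MLP via 1x1x1 convolutions, so each patch-level
    representation is refined independently of neighboring patches."""
    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(channels, channels * 2, kernel_size=1),
            nn.GELU(),
            nn.Conv3d(channels * 2, channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)  # residual, purely local refinement


class UinUNet3D(nn.Module):
    """Toy 3D U-Net with one down/up level and the inner network at the bottleneck."""
    def __init__(self, in_ch: int = 1, base_ch: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv3d(in_ch, base_ch, 3, padding=1), nn.GELU())
        self.down = nn.Conv3d(base_ch, base_ch * 2, kernel_size=2, stride=2)
        self.inner = LocalInnerNet(base_ch * 2)          # local-focused inner network
        self.up = nn.ConvTranspose3d(base_ch * 2, base_ch, kernel_size=2, stride=2)
        self.dec = nn.Sequential(nn.Conv3d(base_ch * 2, base_ch, 3, padding=1), nn.GELU())
        self.out = nn.Conv3d(base_ch, in_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        skip = self.enc(x)
        h = self.inner(self.down(skip))                  # refine patch-wise features
        h = torch.cat([self.up(h), skip], dim=1)         # standard U-Net skip connection
        return self.out(self.dec(h))


if __name__ == "__main__":
    # e.g. a 32^3 grid holding patch-level SDF representations (one channel here)
    grid = torch.randn(1, 1, 32, 32, 32)
    print(UinUNet3D()(grid).shape)  # torch.Size([1, 1, 32, 32, 32])
```

In this sketch the outer U-Net mixes information across the whole grid through its down/upsampling path, while the inner 1x1x1 network touches each location separately, which is one plausible way to realize the patch-independent reconstruction behavior described in the abstract.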