We present ShapeCrafter, a neural network for recursive text-conditioned 3D shape generation. Existing methods for text-conditioned 3D shape generation consume an entire text prompt and generate a 3D shape in a single step. However, humans tend to describe shapes recursively: we may start with an initial description and progressively add details based on intermediate results. To capture this recursive process, we introduce a method to generate a 3D shape distribution, conditioned on an initial phrase, that gradually evolves as more phrases are added. Since existing datasets are insufficient for training this approach, we present Text2Shape++, a large dataset of 369K shape-text pairs that supports recursive shape generation. To capture local details that are often used to refine shape descriptions, we build on top of vector-quantized deep implicit functions that generate a distribution of high-quality shapes. Results show that our method can generate shapes consistent with text descriptions, and that the shapes evolve gradually as more phrases are added. Our method supports shape editing and extrapolation, and can enable new applications in human-machine collaboration for creative design.
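The idea of a shape distribution that evolves as phrases are added can be illustrated with a toy sketch. This is not the ShapeCrafter architecture, which the abstract does not specify in detail. It assumes, purely for illustration, that a shape is represented as a small grid of cells, each holding a categorical distribution over vector-quantized codebook entries, and that every new phrase contributes an additive update to the per-cell logits. The names `phrase_logits`, `recursive_shape_distribution`, `CODEBOOK_SIZE`, and `NUM_CELLS` are hypothetical, and the learned text-conditioned network is replaced by a seeded random stand-in so the example runs without a trained model.

```python
import math
import random

CODEBOOK_SIZE = 8  # hypothetical number of vector-quantized shape codes
NUM_CELLS = 4      # hypothetical number of cells in the latent shape grid


def phrase_logits(phrase):
    """Stand-in for a learned text-conditioned network (an assumption).

    Returns repeatable pseudo-random logits for every cell, seeded by the
    phrase text, so the sketch needs no trained weights.
    """
    rng = random.Random(phrase)
    return [[rng.uniform(-1.0, 1.0) for _ in range(CODEBOOK_SIZE)]
            for _ in range(NUM_CELLS)]


def softmax(row):
    """Convert one cell's logits into a probability distribution."""
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def recursive_shape_distribution(phrases):
    """Yield (phrase, per-cell code distribution) after each added phrase."""
    # All-zero logits correspond to a uniform distribution over codes.
    state = [[0.0] * CODEBOOK_SIZE for _ in range(NUM_CELLS)]
    for phrase in phrases:
        update = phrase_logits(phrase)
        # Each phrase refines the running logits instead of restarting.
        state = [[s + u for s, u in zip(s_row, u_row)]
                 for s_row, u_row in zip(state, update)]
        yield phrase, [softmax(row) for row in state]


steps = list(recursive_shape_distribution(
    ["a chair", "with four legs", "and armrests"]))
for phrase, dist in steps:
    # Show the most likely code in the first cell after each phrase.
    best = max(range(CODEBOOK_SIZE), key=lambda k: dist[0][k])
    print(f"{phrase!r}: cell 0 most likely code = {best}")
```

Because the running state is carried from one phrase to the next, the distribution after the third phrase depends on all three phrases, in contrast to a single-step generator that would consume the full prompt at once.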