With growing interest in deep learning algorithms and computational design in the architectural field, the need for large, accessible, and diverse architectural datasets is increasing. We address this need by constructing a field-specific synthetic data generation pipeline that generates an arbitrary amount of 3D data along with the associated 2D and 3D annotations. The variety of annotations and the flexibility to customize the generated building and dataset parameters make this framework suitable for multiple deep learning tasks, including geometric deep learning that requires direct 3D supervision. In creating our building data generation pipeline, we leveraged architectural knowledge from experts to construct a framework that is modular and extensible and that provides a sufficient number of class-balanced data samples. Moreover, we purposefully involve the researcher in dataset customization, allowing the introduction of additional building components, material textures, and building classes, and the adjustment of the number and type of annotations as well as the number of views per 3D model sample. In this way, the framework can satisfy different research requirements and adapt to a large variety of tasks. All code and data are made publicly available.