Generating visual layouts is an essential ingredient of graphic design. The ability to condition layout generation on a partial subset of component attributes is critical to real-world applications that involve user interaction. Recently, diffusion models have demonstrated high-quality generative performance in various domains. However, it is unclear how to apply diffusion models to the natural representation of layouts, which consists of a mix of discrete (class) and continuous (location, size) attributes. To address the conditional layout generation problem, we introduce DLT, a joint discrete-continuous diffusion model. DLT is a transformer-based model with a flexible conditioning mechanism that allows for conditioning on any given subset of the layout component classes, locations, and sizes. Our method outperforms state-of-the-art generative models on various layout generation datasets with respect to different metrics and conditioning settings. Additionally, we validate the effectiveness of our proposed conditioning mechanism and of the joint discrete-continuous diffusion process. This joint process can be incorporated into a wide range of mixed discrete-continuous generative tasks.
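To make the mixed representation and the partial conditioning concrete, the following is a minimal sketch, not the DLT implementation. It assumes a layout is a list of components, each with a discrete class id and continuous location and size, and that a per-attribute flag marks what the user has fixed. The names `Component`, `Condition`, `MASK_CLASS`, and `init_sample` are hypothetical, and the choice of a mask token for unknown classes and Gaussian noise for unknown coordinates is an illustrative assumption about how a joint discrete-continuous process might be initialized.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

MASK_CLASS = -1  # hypothetical placeholder id for a class that is not given


@dataclass
class Component:
    cls: int                   # discrete attribute: component class id
    loc: Tuple[float, float]   # continuous attribute: (x, y), normalized to [0, 1]
    size: Tuple[float, float]  # continuous attribute: (width, height), normalized to [0, 1]


@dataclass
class Condition:
    cls: bool = False   # True means the class is fixed by the user
    loc: bool = False   # True means the location is fixed by the user
    size: bool = False  # True means the size is fixed by the user


def init_sample(layout: List[Component], conds: List[Condition],
                rng: random.Random) -> List[Component]:
    """Build a starting point for generation.

    Conditioned attributes are copied unchanged. Unconditioned classes are
    replaced by MASK_CLASS and unconditioned coordinates by Gaussian noise
    (both are assumptions made for this sketch).
    """
    out = []
    for comp, cond in zip(layout, conds):
        cls = comp.cls if cond.cls else MASK_CLASS
        loc = comp.loc if cond.loc else (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        size = comp.size if cond.size else (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        out.append(Component(cls, loc, size))
    return out


# Example: the first component is fully specified; only the class of the
# second is given, so its location and size are left to be generated.
layout = [Component(0, (0.5, 0.1), (0.8, 0.1)),
          Component(2, (0.5, 0.6), (0.6, 0.4))]
conds = [Condition(cls=True, loc=True, size=True), Condition(cls=True)]
start = init_sample(layout, conds, random.Random(0))
```

Under these assumptions, any subset of classes, locations, and sizes can be fixed simply by toggling the corresponding flags; a denoising model would then update only the attributes that were not conditioned.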