Conditional graphic layout generation, which generates realistic layouts according to user constraints, is a challenging task that has not yet been well studied. First, there is limited discussion of how to handle diverse user constraints flexibly and uniformly. Second, to make layouts conform to user constraints, existing work often sacrifices generation quality significantly. In this work, we propose LayoutFormer++ to tackle both problems. First, to handle diverse constraints flexibly, we propose a constraint serialization scheme, which represents different user constraints as sequences of tokens in a predefined format. Then, we formulate conditional layout generation as a sequence-to-sequence transformation and adopt an encoder-decoder framework with Transformer as the basic architecture. Furthermore, to make layouts better meet user requirements without harming quality, we propose a decoding space restriction strategy. Specifically, we prune the predicted distribution by ignoring the options that definitely violate user constraints and that are likely to result in low-quality layouts, and have the model sample from the restricted distribution. Experiments demonstrate that LayoutFormer++ outperforms existing approaches on all tasks, achieving both better generation quality and fewer constraint violations.
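The two ideas named above, constraint serialization and decoding space restriction, can be illustrated with a minimal sketch. It is not the LayoutFormer++ implementation. The token format (`type`, `name=value`, `|` separator), the function names, and the `min_prob` threshold are all hypothetical choices made for illustration, and the sketch assumes that options violating a constraint and options with low predicted probability are both pruned.

```python
import math
import random


def serialize_constraints(elements):
    """Flatten per-element constraints into one token sequence.

    Hypothetical format: element type, then any known attributes as
    'name=value' tokens, with '|' separating elements.
    """
    tokens = []
    for i, (elem_type, attrs) in enumerate(elements):
        if i > 0:
            tokens.append("|")
        tokens.append(elem_type)
        for name in ("x", "y", "w", "h"):
            if name in attrs:
                tokens.append(f"{name}={attrs[name]}")
    return tokens


def restricted_distribution(logits, allowed, min_prob=0.05):
    """Prune a predicted distribution and renormalize it.

    An option is dropped if it is outside `allowed` (it would violate a
    constraint) or if its probability is below `min_prob` (assumed here
    to be a proxy for a low-quality choice).
    """
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # numerically stable softmax
    z = sum(exps)
    probs = [e / z for e in exps]

    kept = [p if (i in allowed and p >= min_prob) else 0.0
            for i, p in enumerate(probs)]
    if sum(kept) == 0.0:
        # Fall back to the constraint mask alone so decoding can continue.
        kept = [p if i in allowed else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    if total == 0.0:
        raise ValueError("no option satisfies the constraints")
    return [p / total for p in kept]


def sample(probs, rng=random):
    """Draw one option index from the restricted distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    print(serialize_constraints([("text", {"w": 40}), ("image", {})]))
    dist = restricted_distribution([2.0, 1.0, 0.0, -5.0],
                                   allowed={1, 2, 3}, min_prob=0.01)
    print(dist, sample(dist))
```

In this sketch the constraint mask is applied at every decoding step, so a violating option can never be sampled, while the probability threshold only removes unlikely options and falls back to the mask alone when nothing else survives.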