Masked Diffusion Models (MDMs) have emerged as one of the most promising paradigms for generative modeling over discrete domains. It is known that MDMs are effectively trained to decode tokens in a random order, and that this ordering has significant performance implications in practice. This observation raises a fundamental question: can we design a training framework that optimizes for a favorable decoding order? We answer this in the affirmative, showing that the continuous-time variational objective of MDMs, when equipped with multivariate noise schedules, can identify and optimize for a decoding order during training. We establish a direct correspondence between the decoding order and the multivariate noise schedule, and show that this setting breaks the invariance of the MDM objective to the noise schedule. Furthermore, we prove that the MDM objective decomposes precisely into a weighted sum of auto-regressive losses over these orders, which establishes MDMs as auto-regressive models with learnable orders.
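A minimal sketch of the claimed decomposition, in assumed notation not fixed by the abstract: let x = (x^1, ..., x^L) denote a sequence, S_L the set of decoding orders (permutations of token positions), and w_\lambda(\sigma) nonnegative weights induced by the multivariate noise schedule \lambda, with \sum_{\sigma} w_\lambda(\sigma) = 1. Under these assumptions the stated result reads as

\[
\mathcal{L}_{\mathrm{MDM}}(\theta;\lambda)
\;=\;
\sum_{\sigma \in S_L} w_\lambda(\sigma)\,\mathcal{L}_{\mathrm{AR}}(\theta;\sigma),
\qquad
\mathcal{L}_{\mathrm{AR}}(\theta;\sigma)
\;=\;
-\,\mathbb{E}_{x}\!\left[\sum_{i=1}^{L}\log p_\theta\!\left(x^{\sigma(i)} \mid x^{\sigma(<i)}\right)\right],
\]

so that choosing the schedule \lambda reweights the orders \sigma, which is what makes the decoding order learnable during training.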