Masked diffusion language models (MDLMs) generate text through an iterative denoising process and have recently gained attention for their mask-parallel decoding and performance competitive with autoregressive large language models. However, effective mechanisms for inference-time control and steering of MDLMs remain largely unexplored. We present an activation-steering framework for MDLMs that computes layer-wise steering vectors from a single forward pass using contrastive examples, without simulating the denoising trajectory. These directions are applied at every reverse-diffusion step, yielding an efficient inference-time control mechanism. Experiments on LLaDA-8B-Instruct demonstrate reliable modulation of high-level attributes, with ablations examining the effects of steering across transformer sub-modules and token scope (prompt vs.\ response).
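A minimal formal sketch of this construction (our notation, assuming a standard difference-of-means estimator; $\mathcal{D}$, $\bar{h}^{(\ell)}$, and $\alpha$ are illustrative symbols, not taken from the paper): given contrastive pairs $(x^{+}, x^{-}) \in \mathcal{D}$, a layer-$\ell$ direction can be computed from single forward passes and then added to the hidden states at every reverse-diffusion step $t$,
\[
v^{(\ell)} = \frac{1}{|\mathcal{D}|} \sum_{(x^{+},\, x^{-}) \in \mathcal{D}} \Big( \bar{h}^{(\ell)}(x^{+}) - \bar{h}^{(\ell)}(x^{-}) \Big),
\qquad
h_{t}^{(\ell)} \leftarrow h_{t}^{(\ell)} + \alpha\, v^{(\ell)},
\]
where $\bar{h}^{(\ell)}(x)$ denotes the token-averaged hidden state at layer $\ell$ and $\alpha$ sets the steering strength.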