Diffusion-based approaches to long-form text generation suffer from prohibitive computational cost and memory overhead as sequence length increases. We introduce SA-DiffuSeq, a diffusion framework that integrates sparse attention to fundamentally improve scalability for long-document modeling. By selectively allocating attention within the diffusion process, SA-DiffuSeq significantly reduces computational complexity while maintaining semantic coherence and generation quality. A key component of our method is a soft absorbing state tailored to sparse-attention dynamics, which stabilizes diffusion trajectories and accelerates sequence reconstruction. This design improves sampling efficiency and enhances precision in long-range dependency modeling. Extensive experiments demonstrate that SA-DiffuSeq consistently surpasses state-of-the-art diffusion baselines in both training efficiency and sampling speed, with especially strong gains on extended sequences. These properties make SA-DiffuSeq well suited for demanding long-form applications such as scientific writing, large-scale code generation, and multi-turn long-context dialogue. Overall, our results indicate that incorporating structured sparsity into diffusion models is a promising direction for efficient and expressive long-text generation.
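To make the core idea concrete, the sketch below shows one way structured sparsity can enter a diffusion denoiser: attention restricted to a local window, applied to noised embeddings that are softly mixed with an absorbing state. This is a minimal illustration under our own assumptions, not the paper's implementation; all names (`local_window_mask`, `sparse_attention`, `window`, the zero-vector absorbing state) are hypothetical, and a real system would use a blocked kernel rather than a dense mask to realize the memory savings.

```python
# Illustrative sketch only: local-window sparse attention inside one denoising
# step. Names and shapes are assumptions, not SA-DiffuSeq's actual API.
import torch


def local_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask: position i may attend to j only if |i - j| <= window."""
    idx = torch.arange(seq_len)
    return (idx[None, :] - idx[:, None]).abs() <= window


def sparse_attention(q, k, v, window: int = 64):
    """Scaled dot-product attention restricted to a local window.

    q, k, v: (batch, seq_len, dim). A blocked implementation would bring the
    cost from O(L^2) toward O(L * window); this dense-masked version only
    illustrates the attention pattern, not the memory savings.
    """
    scale = q.shape[-1] ** -0.5
    scores = torch.einsum("bid,bjd->bij", q, k) * scale
    mask = local_window_mask(q.shape[1], window).to(scores.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.einsum("bij,bjd->bid", scores.softmax(dim=-1), v)


# Toy usage: noised token embeddings are softly mixed with an absorbing
# vector (here a zero placeholder) before the sparse attention step.
batch, seq_len, dim = 2, 512, 128
x_t = torch.randn(batch, seq_len, dim)               # noised embeddings at step t
absorb = torch.zeros(1, 1, dim)                       # placeholder soft absorbing state
noise_level = 0.3
x_t = (1 - noise_level) * x_t + noise_level * absorb  # soft absorption of noised positions
out = sparse_attention(x_t, x_t, x_t, window=64)
print(out.shape)  # torch.Size([2, 512, 128])
```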