We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level boolean oracle function, we propose to decompose the oracle function into token-level guidance that steers the base model during text generation. Specifically, the token-level guidance is approximated by a neural model trained on examples sampled from the base model, requiring no additional labeled data. We present the closed-form optimal solution for incorporating the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments on two applications, (1) text generation with lexical constraints and (2) machine translation with formality control, demonstrate that our framework efficiently guides the base model toward the given oracle while maintaining high generation quality.
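To make the closed-form combination concrete, here is a minimal sketch of one decoding step. It assumes (hypothetically) that the decomposed oracle has already been approximated by two quantities: `r_next`, the estimated probability that the oracle will be satisfied after appending each candidate token, and `r_prefix`, the same estimate for the current prefix alone. The reweighting q(x_i | x_<i) ∝ p(x_i | x_<i) · R(x_<i, x_i) / R(x_<i) then follows the form described in the abstract; all function and variable names here are illustrative, not the paper's actual API.

```python
def nado_step(base_probs, r_next, r_prefix):
    """One step of oracle-guided decoding (illustrative sketch).

    base_probs: base-model next-token distribution p(x_i | x_<i).
    r_next:     estimated oracle-satisfaction prob. for each candidate token.
    r_prefix:   estimated oracle-satisfaction prob. for the current prefix.
    Returns the reweighted, renormalized distribution q(x_i | x_<i).
    """
    # q(x_i | x_<i) ∝ p(x_i | x_<i) * R(x_<i, x_i) / R(x_<i)
    unnorm = [p * r / r_prefix for p, r in zip(base_probs, r_next)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Toy example over a 4-token vocabulary: tokens likely to satisfy the
# oracle (high r_next) are boosted relative to the base distribution.
p = [0.4, 0.3, 0.2, 0.1]
r_next = [0.9, 0.1, 0.5, 0.5]
q = nado_step(p, r_next, r_prefix=0.5)
```

In practice the base model and the guidance model would both be neural networks; the sketch only shows how their per-token outputs combine at each decoding step.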