As large-scale language model pretraining pushes the state of the art in text generation, recent work has turned to controlling attributes of the text such models generate. While modifying the pretrained models via fine-tuning remains the popular approach, it incurs a significant computational cost and can be infeasible due to a lack of appropriate data. As an alternative, we propose MuCoCO -- a flexible and modular algorithm for controllable inference from pretrained models. We formulate the decoding process as an optimization problem in which the multiple attributes we aim to control are easily incorporated as differentiable constraints. By relaxing this discrete optimization to a continuous one, we use Lagrange multipliers and gradient-descent-based techniques to generate the desired text. We evaluate our approach on controllable machine translation and style transfer with multiple sentence-level attributes and observe significant improvements over baselines.
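To make the primal-dual idea concrete, below is a minimal PyTorch sketch of the general scheme the abstract describes: decoding is relaxed to optimization over soft token distributions, and a Lagrange multiplier trades a fluency objective against an attribute constraint. The functions lm_loss and attr_loss, the shapes, thresholds, and step sizes are hypothetical stand-ins for this sketch, not the paper's actual models or implementation.

import torch

torch.manual_seed(0)
vocab_size, seq_len = 50, 8

# Continuous relaxation: each output position holds free logits over the
# vocabulary; a softmax maps them to a soft token distribution.
logits = torch.randn(seq_len, vocab_size, requires_grad=True)

# Stand-in differentiable objectives (assumptions for this sketch only).
target = torch.softmax(torch.randn(seq_len, vocab_size), dim=-1)

def lm_loss(soft_tokens):
    # Placeholder for the pretrained LM's negative log-likelihood.
    return torch.sum((soft_tokens - target) ** 2)

def attr_loss(soft_tokens):
    # Placeholder for a differentiable attribute constraint f(x) <= epsilon,
    # e.g. an attribute classifier's loss. Toy version: discourage token 0.
    return soft_tokens[:, 0].mean()

epsilon = 0.01               # constraint threshold (assumed)
lam = torch.tensor(0.0)      # Lagrange multiplier, kept non-negative
optimizer = torch.optim.Adam([logits], lr=0.1)

for step in range(200):
    soft_tokens = torch.softmax(logits, dim=-1)
    violation = attr_loss(soft_tokens) - epsilon
    # Lagrangian: primary objective plus multiplier-weighted constraint violation.
    lagrangian = lm_loss(soft_tokens) + lam * violation
    optimizer.zero_grad()
    lagrangian.backward()
    optimizer.step()         # gradient descent on the relaxed sequence
    with torch.no_grad():    # gradient ascent on the multiplier
        lam = (lam + 0.05 * violation.detach()).clamp(min=0.0)

# Discretize the relaxed solution: argmax token at each position.
print(torch.softmax(logits, dim=-1).argmax(dim=-1).tolist())

With several attribute constraints, each would get its own multiplier updated in the same ascent step, which is what makes the formulation modular: adding a constraint adds a loss term and a multiplier rather than requiring any retraining.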