Large pre-trained language models (LMs) based on Transformers make it possible to generate very plausible long texts. In this paper, we explore how this generation can be further controlled to satisfy certain constraints (e.g., being non-toxic, positive or negative, conveying certain emotions, etc.) without fine-tuning the LM. Precisely, we formalize constrained generation as a tree exploration process guided by a discriminator that scores how well the associated sequence respects the constraint. Using a discriminator to guide the generation, rather than fine-tuning the LM, is easier and cheaper to train and allows the constraint to be applied more finely and dynamically. We propose several original methods to search this generation tree, notably Monte Carlo Tree Search (MCTS), which provides theoretical guarantees on search efficiency, but also simpler methods based on re-ranking a pool of diverse sequences using the discriminator scores. We evaluate these methods on two types of constraints and two languages: review polarity and emotion control, in French and English. We show that MCTS achieves state-of-the-art results in constrained generation, without having to tune the language model, in both tasks and languages. We also demonstrate that our other proposed methods based on re-ranking can be very effective when diversity among the generated propositions is encouraged.
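The re-ranking approach mentioned above can be sketched minimally: sample a pool of diverse candidate sequences from the LM, score each with the discriminator, and keep the best-scoring one. The `toy_lm` and `toy_disc` functions below are hypothetical stand-ins (assumptions for illustration), not the pretrained LM or trained discriminator used in the paper:

```python
import random

random.seed(0)

def sample_sequences(lm_sample, n):
    """Draw n candidate sequences from the language model sampler.

    Diversity among candidates matters here: re-ranking can only pick a
    constraint-satisfying sequence if the pool actually contains one.
    """
    return [lm_sample() for _ in range(n)]

def rerank(candidates, discriminator):
    """Sort candidates by discriminator score, best first."""
    return sorted(candidates, key=discriminator, reverse=True)

# Toy stand-ins (assumptions): the 'LM' emits random strings over {a, b},
# and the 'discriminator' rewards the proportion of 'a' characters.
toy_lm = lambda: "".join(random.choice("ab") for _ in range(5))
toy_disc = lambda s: s.count("a") / len(s)

pool = sample_sequences(toy_lm, 8)
best = rerank(pool, toy_disc)[0]
```

The design choice to separate generation from scoring is what makes the approach cheap: the LM stays frozen, and only the (much smaller) discriminator needs training.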