While pretrained language models (PLMs) have greatly improved text generation, they have also been known to produce unfaithful or inappropriate content. In contrast, classic template-based systems provide strong guarantees of faithfulness at the cost of fluency. We propose TempLM, which achieves the best of both worlds by distilling a PLM into a template-based generator. On the E2E and SynthBio data-to-text datasets, we show that TempLM is more faithful than the original PLM and is more fluent than prior template systems. Notably, on an out-of-domain evaluation, TempLM reduces a finetuned BART model's unfaithfulness rate from 83% to 0%. In a human study, we find that TempLM's templates substantially improve upon human-written ones in BERTScore.