Sentence compression reduces the length of text by removing non-essential content while preserving important facts and grammaticality. Unsupervised, objective-driven methods for sentence compression can be used to create customized models without ground-truth training data, while allowing flexibility in the objective function(s) used for learning and inference. Recent unsupervised sentence compression approaches use custom objectives to guide discrete search; however, guided search is expensive at inference time. In this work, we explore the use of reinforcement learning to train effective sentence compression models that are also fast when generating predictions. In particular, we cast the task as binary sequence labelling and fine-tune a pre-trained transformer using a simple policy gradient approach. Our approach outperforms other unsupervised models while also being more efficient at inference time.
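To make the setup concrete, the following is a minimal sketch (not the authors' exact implementation) of policy-gradient fine-tuning for sentence compression cast as binary sequence labelling. It assumes DistilBERT as the encoder, a hypothetical toy reward that trades off compression ratio against retaining capitalised (content-like) tokens, and plain single-sample REINFORCE; a real objective would also score fluency, e.g. with a language model.

```python
# Sketch: REINFORCE for unsupervised sentence compression as
# binary (keep/drop) sequence labelling with a pre-trained transformer.
# The reward function below is a hypothetical stand-in, not the paper's.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased")
head = torch.nn.Linear(encoder.config.hidden_size, 2)  # per-token keep/drop logits
optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-5
)

def reward(tokens, keep_mask):
    # Toy objective: reward shorter outputs that keep capitalised tokens.
    kept = [t for t, k in zip(tokens, keep_mask) if k]
    if not kept:
        return -1.0
    brevity = 1.0 - len(kept) / len(tokens)
    caps_total = sum(t[0].isupper() for t in tokens)
    caps_kept = sum(t[0].isupper() for t in kept)
    content = caps_kept / max(1, caps_total)
    return brevity + content

def train_step(sentence):
    words = sentence.split()
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    hidden = encoder(**enc).last_hidden_state        # (1, seq_len, d)
    logits = head(hidden).squeeze(0)                 # (seq_len, 2)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                          # 0 = drop, 1 = keep per subword
    # Label each word by the action on its first subword.
    word_ids = enc.word_ids()
    first_idx = torch.tensor([word_ids.index(i) for i in range(len(words))])
    keep = [bool(actions[i]) for i in first_idx]
    r = reward(words, keep)
    # REINFORCE: ascend r * grad log p(actions), over word-level positions only.
    loss = -(r * dist.log_prob(actions)[first_idx].sum())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return r, [w for w, k in zip(words, keep) if k]

r, compressed = train_step("The quick brown fox jumped over the extremely lazy dog .")
print(r, compressed)
```

Because the labeller makes one binary decision per token in a single forward pass, inference is a simple argmax over the keep/drop logits, with no discrete search at prediction time.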