Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness is less studied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend: beam search with large beam sizes produces the most faithful summaries, while nucleus sampling generates the least faithful ones. We propose two faithfulness-aware generation methods to further improve faithfulness over current generation techniques: (1) ranking candidates generated by beam search using automatic faithfulness metrics and (2) incorporating lookahead heuristics that produce a faithfulness score for the future summary. We show that both generation methods significantly improve faithfulness across two datasets as evaluated by four automatic faithfulness metrics and human evaluation. To reduce computational cost, we demonstrate a simple distillation approach that allows the model to generate faithful summaries with just greedy decoding. Our code is publicly available at https://github.com/amazon-science/faithful-summarization-generation
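The first proposed method, reranking beam-search candidates with an automatic faithfulness metric, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `faithfulness_score` here is a toy word-overlap proxy standing in for any automatic faithfulness metric (e.g., an entailment- or QA-based scorer), and `rerank_by_faithfulness` simply selects the highest-scoring candidate.

```python
def faithfulness_score(source: str, summary: str) -> float:
    # Toy proxy: fraction of summary tokens that appear in the source.
    # In practice this would be replaced by an automatic faithfulness
    # metric such as an entailment-based or QA-based scorer.
    src_tokens = set(source.lower().split())
    summ_tokens = summary.lower().split()
    if not summ_tokens:
        return 0.0
    return sum(t in src_tokens for t in summ_tokens) / len(summ_tokens)


def rerank_by_faithfulness(source: str, candidates: list[str]) -> str:
    # Given candidates produced by beam search, return the one the
    # faithfulness metric scores highest.
    return max(candidates, key=lambda c: faithfulness_score(source, c))
```

In this sketch, beam search supplies the candidate list; the reranker only reorders it, so generation quality is preserved while faithfulness improves.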