Learning to summarize from human feedback

As language models become more powerful, training and evaluation are increasingly bottlenecked by the data and metrics used for a particular task. For example, summarization models are often trained to predict human reference summaries and evaluated using ROUGE, but both of these metrics are rough proxies for what we really care about -- summary quality. In this work, we show that it is possible to significantly improve summary quality by training a model to optimize for human preferences. We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. We apply our method to a version of the TL;DR dataset of Reddit posts and find that our models significantly outperform both human reference summaries and much larger models fine-tuned with supervised learning alone. Our models also transfer to CNN/DM news articles, producing summaries nearly as good as the human reference without any news-specific fine-tuning. We conduct extensive analyses to understand our human feedback dataset and fine-tuned models. We establish that our reward model generalizes to new datasets, and that optimizing our reward model results in better summaries than optimizing ROUGE according to humans. We hope the evidence from our paper motivates machine learning researchers to pay closer attention to how their training loss affects the model behavior they actually want.
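The step "train a model to predict the human-preferred summary" can be illustrated with a small sketch. The abstract does not state the loss, so the logistic (Bradley-Terry style) pairwise loss below is an assumption, chosen because it is a common way to fit scalar rewards to pairwise comparisons. The function names `preference_loss` and `predicted_preference` are hypothetical, and the rewards are plain floats standing in for the outputs of a learned reward model.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred summary wins.

    Assumed model (not taken from the abstract): the probability that a
    summary is preferred is sigmoid(reward_preferred - reward_rejected).
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) == log(1 + exp(-margin)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def predicted_preference(reward_a: float, reward_b: float) -> float:
    """Probability, under the same assumed model, that summary A beats B."""
    return math.exp(-preference_loss(reward_a, reward_b))


if __name__ == "__main__":
    # Hypothetical reward scores for two candidate summaries of one post.
    r_chosen, r_other = 1.3, 0.2
    print("loss:", preference_loss(r_chosen, r_other))
    print("P(chosen preferred):", predicted_preference(r_chosen, r_other))
```

Minimizing this loss over many labeled comparisons pushes the reward of the preferred summary above that of the rejected one. Under the approach the abstract describes, the resulting scalar reward would then serve as the training signal for the reinforcement learning stage.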


Related content

MODELS


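The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM-SIGSOFT and IEEE-TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website: http://www.modelsconference.org/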

Latest collection of 100+ papers on Self-Supervised Learning

Zhuanzhi Member Service

165+ reads · March 18, 2020

Meta-transfer Learning for Few-shot Learning

Zhuanzhi Member Service

159+ reads · February 29, 2020

MIT: Deep Learning State of the Art in 2020 (87-slide PPT)

Zhuanzhi Member Service

62+ reads · February 17, 2020

[University of Chicago] GRAPH-BERT: Only Attention is Needed for Learning Graph Representations

Zhuanzhi Member Service

85+ reads · January 15, 2020