Aligning Language Models with Preferences through f-divergence Minimization

June 6, 2023

Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Nahyeon Ryu, Marc Dymetman

Aligning language models with preferences can be posed as approximating a target distribution representing some desired behavior. Existing approaches differ both in the functional form of the target distribution and the algorithm used to approximate it. For instance, Reinforcement Learning from Human Feedback (RLHF) corresponds to minimizing a reverse KL from an implicit target distribution arising from a KL penalty in the objective. On the other hand, Generative Distributional Control (GDC) has an explicit target distribution and minimizes a forward KL from it using the Distributional Policy Gradient (DPG) algorithm. In this paper, we propose a new approach, f-DPG, which allows the use of any f-divergence to approximate any target distribution that can be evaluated. f-DPG unifies both frameworks (RLHF, GDC) and the approximation methods (DPG, RL with KL penalties). We show the practical benefits of various choices of divergence objectives and demonstrate that there is no universally optimal objective but that different divergences present different alignment and diversity trade-offs. We show that Jensen-Shannon divergence strikes a good balance between these objectives, and frequently outperforms forward KL divergence by a wide margin, leading to significant improvements over prior work. These distinguishing characteristics between divergences persist as the model size increases, highlighting the importance of selecting appropriate divergence objectives.
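To make the divergences named in the abstract concrete, here is a minimal sketch that evaluates forward KL, reverse KL, and Jensen-Shannon divergence as instances of a single f-divergence on a toy three-outcome distribution. It is not the f-DPG algorithm, which the abstract does not spell out. The convention D_f(P || Q) = sum_x q(x) f(p(x)/q(x)) is one common textbook choice and may differ from the paper's notation. The names `f_divergence`, `target`, and `policy`, and the numbers used, are assumptions made for illustration.

```python
import math


def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)).

    Assumes p and q are strictly positive probability vectors of equal
    length, and that f is convex with f(1) = 0.
    """
    return sum(qx * f(px / qx) for px, qx in zip(p, q))


def f_kl(t):
    # Generator t * log(t): yields KL(P || Q) under the convention above.
    return t * math.log(t)


def f_swapped_kl(t):
    # Generator -log(t): yields KL(Q || P), the arguments swapped.
    return -math.log(t)


def f_js(t):
    # Generator of the Jensen-Shannon divergence (natural log).
    return 0.5 * (t * math.log(t) - (t + 1) * math.log((t + 1) / 2))


def kl(p, q):
    # Direct KL(P || Q), used only to cross-check the generators.
    return sum(px * math.log(px / qx) for px, qx in zip(p, q))


# Hypothetical toy distributions over three outcomes.
target = [0.70, 0.20, 0.10]   # the distribution we want to approximate
policy = [0.40, 0.40, 0.20]   # the current model

forward_kl = f_divergence(target, policy, f_kl)          # KL(target || policy)
reverse_kl = f_divergence(target, policy, f_swapped_kl)  # KL(policy || target)
js = f_divergence(target, policy, f_js)

print(f"forward KL = {forward_kl:.4f}")
print(f"reverse KL = {reverse_kl:.4f}")
print(f"JS         = {js:.4f}")
```

In this sketch, "forward KL" means KL(target || policy) and "reverse KL" means KL(policy || target), which matches the usual usage of those terms. Forward and reverse KL generally give different values on the same pair of distributions, while Jensen-Shannon is symmetric in its arguments and bounded above by log 2.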


