Learning to Generate Prompts for Dialogue Generation through Reinforcement Learning

Learning · MoDELS · Task-oriented dialogue systems · Prompt · Language modeling

October 13, 2022

Hsuan Su, Pohan Chi, Shih-Cheng Huang, Chung Ho Lam, Saurav Sahay, Shang-Tse Chen, Hung-yi Lee

A large body of work has shown that prompt-based learning is an efficient way to make use of large pre-trained language models. Recent work also shows that a chatbot's output can be steered by plugging in an appropriate prompt. Gradient-based methods are often used to perturb the prompts, but they require access to model parameters, and some language models are not publicly available. In this work, we first explore combining prompting with reinforcement learning (RL) to steer a model's generation without accessing any of its parameters. Second, to reduce training effort and improve generalization to unseen tasks, we apply multi-task learning. Experimental results show that the proposed method can successfully control several state-of-the-art (SOTA) dialogue models without accessing their parameters. Furthermore, the model adapts quickly to an unseen task, needing fewer steps than the baseline model.
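The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: a prompt policy is trained with a policy-gradient rule (REINFORCE is assumed here) using nothing but the text a black-box dialogue model returns. The candidate prompt list, `black_box_chatbot`, `reward_fn`, and all hyperparameters are hypothetical stand-ins, not the authors' prompt generator, models, or rewards.

```python
import math
import random

# Hypothetical candidate prompts the policy can prepend to the user's message.
PROMPTS = ["", "Reply cheerfully:", "Reply in a formal tone:"]


def black_box_chatbot(prompt: str, message: str) -> str:
    """Stand-in for a dialogue model reachable only through text in / text out."""
    if "cheerfully" in prompt:
        return "Glad to hear it! " + message
    return "I see. " + message


def reward_fn(response: str) -> float:
    """Toy task reward (assumption): 1.0 if the reply uses the target wording."""
    return 1.0 if "Glad" in response else 0.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_prompt_policy(steps=500, lr=0.5, seed=0):
    """REINFORCE over a categorical prompt policy.

    Only the policy's own logits are updated; the chatbot is never
    differentiated through and none of its parameters are read.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(PROMPTS)
    baseline = 0.0  # running average of the reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(PROMPTS)), weights=probs)[0]
        response = black_box_chatbot(PROMPTS[action], "I passed my exam.")
        reward = reward_fn(response)
        advantage = reward - baseline
        # d log pi(action) / d logits = one_hot(action) - probs
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += 0.1 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    for prompt, p in zip(PROMPTS, train_prompt_policy()):
        print(f"{prompt!r}: {p:.3f}")
```

In this toy setting the policy should shift probability toward the prompt that earns reward. A setup closer to the one described would presumably replace the fixed prompt list with a generative prompt model and train it across several tasks, but that is outside what the abstract specifies.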
