We introduce a new framework, Directional Stimulus Prompting, that uses a tunable language model (LM) to provide guidance for a black-box, frozen large language model (LLM) on downstream tasks. Unlike prior work that manually or automatically searches for an optimal prompt for each task, we train a policy LM to generate discrete tokens as a directional stimulus for each input, i.e., a hint or cue such as keywords of an article for summarization. The directional stimulus is then combined with the original input and fed into the LLM to guide its generation toward the desired target. The policy LM can be trained through 1) supervised learning on annotated data and 2) reinforcement learning from offline and online rewards, to explore directional stimuli that better align the LLM with human preferences. The framework is flexibly applicable to various LMs and tasks. To verify its effectiveness, we apply it to summarization and dialogue response generation. Experimental results demonstrate that it significantly improves the LLM's performance with a small amount of training data: a T5 (780M) policy LM trained on 2,000 samples from the CNN/Daily Mail dataset improves Codex (175B)'s ROUGE-Avg score by 9.0%, and merely 80 dialogues boost the combined score on the MultiWOZ dataset by 39.7%, achieving comparable or even better performance than some fully trained models. We have made our code publicly available.
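The sketch below illustrates the inference-time flow described above: the policy LM produces keyword tokens as the directional stimulus, which are combined with the original input and sent to the frozen LLM. It is a minimal sketch, assuming a Hugging Face t5-large checkpoint as the policy LM; the prompt template and the query_frozen_llm helper are illustrative placeholders rather than the paper's exact implementation, and the supervised/RL training of the policy LM is not shown.

```python
# Minimal sketch of Directional Stimulus Prompting for summarization.
# Assumptions: a Hugging Face "t5-large" checkpoint stands in for the tunable policy LM,
# and query_frozen_llm() is a hypothetical placeholder for the black-box LLM API (e.g., Codex).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-large")
policy_lm = T5ForConditionalGeneration.from_pretrained("t5-large")


def generate_stimulus(article: str, max_stimulus_tokens: int = 32) -> str:
    """Policy LM step: generate discrete hint tokens (keywords) for one input article."""
    inputs = tokenizer("Extract keywords: " + article, return_tensors="pt",
                       truncation=True, max_length=512)
    output_ids = policy_lm.generate(**inputs, max_new_tokens=max_stimulus_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def build_prompt(article: str, stimulus: str) -> str:
    """Combine the original input with the directional stimulus (template is illustrative)."""
    return (
        f"Article: {article}\n"
        f"Keywords: {stimulus}\n"
        "Summarize the article above in a few sentences, covering the listed keywords.\n"
        "Summary:"
    )


def query_frozen_llm(prompt: str) -> str:
    """Placeholder for the black-box frozen LLM call; wire this to your provider's completion API."""
    raise NotImplementedError("Connect to a frozen LLM such as Codex or GPT-3.")


def summarize(article: str) -> str:
    stimulus = generate_stimulus(article)     # hint/cue from the tunable policy LM
    prompt = build_prompt(article, stimulus)  # stimulus combined with the original input
    return query_frozen_llm(prompt)           # frozen LLM generates the guided summary
```

In the full framework, the same pipeline supplies the reward signal: the frozen LLM's output is scored (e.g., with ROUGE against references), and that score is used to update the policy LM via reinforcement learning.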