Fairness-guided Few-shot Prompting for Large Language Models - 专知论文 (Zhuanzhi Papers)


In-context learning · Context · Bias · Examples · Language models

March 25, 2023

Fairness-guided Few-shot Prompting for Large Language Models


Huan Ma, Changqing Zhang, Yatao Bian, Lemao Liu, Zhirui Zhang, Peilin Zhao, Shu Zhang, Huazhu Fu, Qinghua Hu, Bingzhe Wu

Large language models have demonstrated a surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed from a few input-output examples. However, prior research has shown that in-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. Therefore, the construction of an appropriate prompt is essential for improving the performance of in-context learning. In this paper, we revisit this problem from the view of predictive bias. Specifically, we introduce a metric to evaluate the predictive bias of a fixed prompt against labels or given attributes. Then we empirically show that prompts with higher bias always lead to unsatisfactory predictive quality. Based on this observation, we propose a novel search strategy based on greedy search to identify the near-optimal prompt for improving the performance of in-context learning. We perform comprehensive experiments with state-of-the-art mainstream models such as GPT-3 on various downstream tasks. Our results indicate that our method can enhance the model's in-context learning performance in an effective and interpretable manner.
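
The abstract does not define the bias metric or spell out the search procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that bias is measured as the KL divergence between the model's predicted label distribution and the uniform distribution, and that the search greedily adds the demonstration that lowers this score the most. The names `bias_score`, `greedy_prompt_search`, and `toy_predict` are hypothetical, and `toy_predict` is a stand-in for a real language-model call, not the authors' setup.

```python
import math
from typing import Callable, List, Sequence, Tuple

Example = Tuple[str, str]  # (input text, label)
LABELS = ["positive", "negative"]


def bias_score(probs: Sequence[float]) -> float:
    """Assumed metric: KL divergence from uniform; 0.0 means balanced labels."""
    k = len(probs)
    return sum(p * math.log(p * k) for p in probs if p > 0)


def greedy_prompt_search(
    candidates: Sequence[Example],
    predict: Callable[[List[Example]], List[float]],
    max_shots: int,
) -> Tuple[List[Example], float]:
    """Greedily add the demonstration that lowers the bias score the most."""
    chosen: List[Example] = []
    remaining = list(candidates)
    best = bias_score(predict(chosen))
    while remaining and len(chosen) < max_shots:
        # Score every one-step extension of the current prompt.
        scored = [
            (bias_score(predict(chosen + [ex])), i)
            for i, ex in enumerate(remaining)
        ]
        score, idx = min(scored)
        if score >= best:  # stop when no candidate reduces the bias
            break
        best = score
        chosen.append(remaining.pop(idx))
    return chosen, best


def toy_predict(prompt: List[Example]) -> List[float]:
    """Hypothetical stand-in for an LLM: a prior that leans 'positive',
    shifted by the labels of the demonstrations in the prompt."""
    counts = {"positive": 2.0, "negative": 1.0}
    for _, label in prompt:
        counts[label] += 1.0
    total = sum(counts.values())
    return [counts[label] / total for label in LABELS]


CANDIDATES: List[Example] = [
    ("great movie", "positive"),
    ("terrible plot", "negative"),
    ("loved it", "positive"),
    ("boring", "negative"),
]

chosen, best = greedy_prompt_search(CANDIDATES, toy_predict, max_shots=4)
print(chosen, best)
```

In this toy setting the search picks a single "negative" demonstration, which offsets the stand-in model's prior toward "positive", and then stops because no further candidate lowers the score. With a real model, `predict` would instead query the model's label probabilities for a given prompt.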

