公平导向的少样本提示对大型语言模型的影响 (Fairness-guided Few-shot Prompting for Large Language Models) - 专知论文

会员服务 ·

0

Performer · Prompt · 语言模型化 · Learning · MoDELS ·

2023 年 3 月 23 日

Fairness-guided Few-shot Prompting for Large Language Models

翻译：公平导向的少样本提示对大型语言模型的影响

Huan Ma,Changqing Zhang,Yatao Bian,Lemao Liu,Zhirui Zhang,Peilin Zhao,Shu Zhang,Huazhu Fu,Qinghua Hu,Bingzhe Wu

Large language models have demonstrated surprising ability to perform in-context learning, i.e., these models can be directly applied to solve numerous downstream tasks by conditioning on a prompt constructed by a few input-output examples. However, prior research has shown that in-context learning can suffer from high instability due to variations in training examples, example order, and prompt formats. Therefore, the construction of an appropriate prompt is essential for improving the performance of in-context learning. In this paper, we revisit this problem from the view of predictive bias. Specifically, we introduce a metric to evaluate the predictive bias of a fixed prompt against labels or a given attributes. Then we empirically show that prompts with higher bias always lead to unsatisfactory predictive quality. Based on this observation, we propose a novel search strategy based on the greedy search to identify the near-optimal prompt for improving the performance of in-context learning. We perform comprehensive experiments with state-of-the-art mainstream models such as GPT-3 on various downstream tasks. Our results indicate that our method can enhance the model's in-context learning performance in an effective and interpretable manner.

翻译：大型语言模型已经展示了惊人的现场学习能力, 也就是这些模型可以通过少量的输入输出示例构造出prompt，用于解决众多下游任务。然而，先前的研究表明，现场学习可能由于训练示例的变化、示例顺序和提示格式而产生高度的不稳定性。因此，构造适当的提示对于提高现场学习的性能至关重要。本文从预测偏差的角度重新考虑这个问题。具体而言，我们引入了一个度量来评估固定提示相对于标签或特定属性的预测偏差。然后我们经验证明，具有更高偏差的提示总是导致不令人满意的预测质量。基于这一观察结果，我们提出了一种新的搜索策略，使用贪心搜索来识别接近最优的提示，从而提高现场学习的性能。我们使用GPT-3这样的主流模型在各种下游任务上进行了全面的实验。结果表明，我们的方法可以以有效且可解释的方式增强模型的现场学习性能。

0

相关内容

Performer

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

专知会员服务

104+阅读 · 2023年4月28日

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

专知会员服务

21+阅读 · 2022年3月18日

【MIT-ICLR2022】在机器学习模型中注入公平性, Injecting fairness into machine-learning models

【MIT-ICLR2022】在机器学习模型中注入公平性, Injecting fairness into machine-learning models

专知会员服务

22+阅读 · 2022年3月7日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

【ICML2020】统一预训练伪掩码语言模型

【ICML2020】统一预训练伪掩码语言模型

专知会员服务

27+阅读 · 2020年7月23日

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

专知会员服务

73+阅读 · 2020年5月30日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

专知会员服务

66+阅读 · 2020年4月17日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

11+阅读 · 2019年12月28日

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

NAACL 2022 | 基于Prompt的文本生成迁移学习

NAACL 2022 | 基于Prompt的文本生成迁移学习

PaperWeekly

1+阅读 · 2022年8月31日

ACL 2022 | 基于Prompt的自动去偏：有效减轻预训练语言模型中的偏见

ACL 2022 | 基于Prompt的自动去偏：有效减轻预训练语言模型中的偏见

PaperWeekly

0+阅读 · 2022年7月14日

ACL‘22杰出论文：Prompt范式有bug！

ACL‘22杰出论文：Prompt范式有bug！

夕小瑶的卖萌屋

2+阅读 · 2022年7月10日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Gal-3在AGEs介导单核巨噬细胞生物学行为中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

非酯化脂肪酸对隐性乳腺炎奶牛中性粒细胞跨内皮迁移的影响机制

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

DGKε/SNARE信号通路在糖尿病肾病足细胞胰岛素抵抗中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

近断层地震下基于PFD-LRB的半主动隔震系统抗震性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

MicroRNA-17对血管损伤后内皮修复的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

p62/SQSTM1 在乳腺癌侵袭、转移中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Gata6对血管损伤修复和动脉粥样硬化形成的作用及其机制

国家自然科学基金

0+阅读 · 2011年12月31日

Sema4C对斑马鱼血管和淋巴管生成的调控机制及其在肿瘤转移中的功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

DMT1介导血脑屏障铅转运的正反馈机制及铁的拮抗作用研究

国家自然科学基金

0+阅读 · 2009年12月31日

Evaluating Open-Domain Question Answering in the Era of Large Language Models

Arxiv

0+阅读 · 2023年5月14日

Diffusion Models for Imperceptible and Transferable Adversarial Attack

Arxiv

0+阅读 · 2023年5月14日

Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual Transfer for Open-domain Dialogue Generation

Arxiv

0+阅读 · 2023年5月12日

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

Arxiv

5+阅读 · 2023年5月12日

Privacy-Preserving Prompt Tuning for Large Language Model Services

Arxiv

0+阅读 · 2023年5月10日

ToolCoder: Teach Code Generation Models to use API search tools

Arxiv

0+阅读 · 2023年5月9日

Meta Learning for Natural Language Processing: A Survey

Meta Learning for Natural Language Processing: A Survey

Arxiv

14+阅读 · 2022年5月3日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Generative Models as a Data Source for Multiview Representation Learning

Arxiv

16+阅读 · 2021年6月9日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

【吴恩达新课程】ChatGPT提示工程，ChatGPT Prompt Engineering for Developers

专知会员服务

104+阅读 · 2023年4月28日

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

【Hugging Face】使用自定义数据集微调语义分割模型，Fine-Tune a Semantic Segmentation Model with a Custom Dataset

专知会员服务

21+阅读 · 2022年3月18日

【MIT-ICLR2022】在机器学习模型中注入公平性, Injecting fairness into machine-learning models

【MIT-ICLR2022】在机器学习模型中注入公平性, Injecting fairness into machine-learning models

专知会员服务

22+阅读 · 2022年3月7日

【文本生成现代方法】Modern Methods for Text Generation

【文本生成现代方法】Modern Methods for Text Generation

专知会员服务

44+阅读 · 2020年9月11日

【ICML2020】统一预训练伪掩码语言模型

【ICML2020】统一预训练伪掩码语言模型

专知会员服务

27+阅读 · 2020年7月23日

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

1750亿参数！GPT-3来了！31位作者，OpenAI发布小样本学习器语言模型

专知会员服务

73+阅读 · 2020年5月30日

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

【斯坦福】探究预训练语言模型中的可迁移性，Investigating Transferability in PLM

专知会员服务

20+阅读 · 2020年5月3日

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

【异构图迁移的零样本学习】Heterogeneous Graph-based Knowledge Transfer for Generalized Zero-shot Learning

专知会员服务

66+阅读 · 2020年4月17日

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

【ICLR2020】理解非自回归机器翻译中的知识蒸馏（Understanding Knowledge Distillation in Non-autoregressive Machine Translation）

专知会员服务

11+阅读 · 2019年12月28日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

GNN 新基准！Long Range Graph Benchmark

GNN 新基准！Long Range Graph Benchmark

图与推荐

0+阅读 · 2022年10月18日

NAACL 2022 | 基于Prompt的文本生成迁移学习

NAACL 2022 | 基于Prompt的文本生成迁移学习

PaperWeekly

1+阅读 · 2022年8月31日

ACL 2022 | 基于Prompt的自动去偏：有效减轻预训练语言模型中的偏见

ACL 2022 | 基于Prompt的自动去偏：有效减轻预训练语言模型中的偏见

PaperWeekly

0+阅读 · 2022年7月14日

ACL‘22杰出论文：Prompt范式有bug！

ACL‘22杰出论文：Prompt范式有bug！

夕小瑶的卖萌屋

2+阅读 · 2022年7月10日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Evaluating Open-Domain Question Answering in the Era of Large Language Models

Arxiv

0+阅读 · 2023年5月14日

Diffusion Models for Imperceptible and Transferable Adversarial Attack

Arxiv

0+阅读 · 2023年5月14日

Prompt Learning to Mitigate Catastrophic Forgetting in Cross-lingual Transfer for Open-domain Dialogue Generation

Arxiv

0+阅读 · 2023年5月12日

Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning

Arxiv

5+阅读 · 2023年5月12日

Privacy-Preserving Prompt Tuning for Large Language Model Services

Arxiv

0+阅读 · 2023年5月10日

ToolCoder: Teach Code Generation Models to use API search tools

Arxiv

0+阅读 · 2023年5月9日

Meta Learning for Natural Language Processing: A Survey

Meta Learning for Natural Language Processing: A Survey

Arxiv

14+阅读 · 2022年5月3日

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Arxiv

30+阅读 · 2021年7月28日

Generative Models as a Data Source for Multiview Representation Learning

Arxiv

16+阅读 · 2021年6月9日

Making Pre-trained Language Models Better Few-shot Learners

Arxiv

14+阅读 · 2020年12月31日

相关基金

Gal-3在AGEs介导单核巨噬细胞生物学行为中的作用及机制

国家自然科学基金

0+阅读 · 2015年12月31日

非酯化脂肪酸对隐性乳腺炎奶牛中性粒细胞跨内皮迁移的影响机制

国家自然科学基金

0+阅读 · 2014年12月31日

PPAR β/δ基因在结直肠癌血管生成调控中的作用及分子机理

国家自然科学基金

2+阅读 · 2014年12月31日

DGKε/SNARE信号通路在糖尿病肾病足细胞胰岛素抵抗中的作用及机制

国家自然科学基金

0+阅读 · 2013年12月31日

近断层地震下基于PFD-LRB的半主动隔震系统抗震性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

MicroRNA-17对血管损伤后内皮修复的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

p62/SQSTM1 在乳腺癌侵袭、转移中的作用及机制

国家自然科学基金

0+阅读 · 2012年12月31日

Gata6对血管损伤修复和动脉粥样硬化形成的作用及其机制

国家自然科学基金

0+阅读 · 2011年12月31日

Sema4C对斑马鱼血管和淋巴管生成的调控机制及其在肿瘤转移中的功能研究

国家自然科学基金

0+阅读 · 2011年12月31日

DMT1介导血脑屏障铅转运的正反馈机制及铁的拮抗作用研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员