Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-specific annotations. Despite their promising performance, most existing few-shot approaches that learn only from the small training set still underperform fully supervised training by nontrivial margins. In this work, we study few-shot learning with PLMs from a different perspective: We first tune an autoregressive PLM on the few-shot samples and then use it as a generator to synthesize a large amount of novel training samples which augment the original training set. To encourage the generator to produce label-discriminative samples, we train it via weighted maximum likelihood, where the weight of each token is automatically adjusted based on a discriminative meta-learning objective. A classification PLM can then be fine-tuned on both the few-shot and the synthetic samples with regularization for better generalization and stability. Our approach, FewGen, achieves overall better results than existing few-shot learning methods across seven classification tasks of the GLUE benchmark, improving over no-augmentation methods by 5+ average points and outperforming augmentation methods by 3+ average points.
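To make the weighted maximum likelihood objective concrete, below is a minimal PyTorch sketch assuming a Hugging Face causal LM: each token's negative log-likelihood is scaled by a per-token weight before averaging. In FewGen these weights would be set by the discriminative meta-learning objective; here `token_weights` is a hypothetical stand-in, and the usage example simply passes uniform weights (recovering standard MLE) as a baseline.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPT-2 stands in for the autoregressive generator PLM.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def weighted_mle_loss(input_ids, attention_mask, token_weights):
    """Per-token cross-entropy, scaled by token_weights and averaged
    over non-padding positions. token_weights is a placeholder for the
    meta-learned weights described in the abstract."""
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    # Shift so that position t predicts token t+1.
    logits = logits[:, :-1, :]
    labels = input_ids[:, 1:]
    weights = token_weights[:, 1:] * attention_mask[:, 1:]
    nll = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        reduction="none",
    ).view_as(labels)
    return (weights * nll).sum() / weights.sum().clamp(min=1.0)

# Baseline usage: uniform weights reduce this to standard MLE tuning
# on a prompt-formatted few-shot sample (illustrative text only).
batch = tokenizer(
    ["This movie was great. Sentiment: positive"], return_tensors="pt"
)
uniform_w = torch.ones_like(batch["input_ids"], dtype=torch.float)
loss = weighted_mle_loss(
    batch["input_ids"], batch["attention_mask"], uniform_w
)
loss.backward()
```

Upweighting tokens that distinguish one label's samples from another's is what pushes the generator toward label-discriminative outputs rather than generic fluent text.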