Pre-trained Language Models (PLMs), as parametric eager learners, have become the de facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (k-NN) classifiers, as a lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit k-NN classifiers for augmenting PLM-based classifiers. At the methodological level, we propose to adopt k-NN with the textual representations of PLMs in two steps: (1) utilize k-NN as prior knowledge to calibrate the training process, and (2) linearly interpolate the probability distribution predicted by k-NN with that of the PLM classifier. At the heart of our approach is k-NN-calibrated training, which treats the predicted results as indicators of easy versus hard examples during training. To cover diverse application scenarios, we conduct extensive experiments across eight end-tasks under both the fine-tuning and prompt-tuning paradigms and in zero-shot, few-shot, and fully supervised settings. We hope our exploration will encourage the community to revisit the power of classical methods for efficient NLP.\footnote{Code and datasets are available at https://github.com/zjunlp/Revisit-KNN.}
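To make the two steps concrete, the sketch below (our illustration, not the authors' released implementation) shows how a k-NN distribution built over cached PLM sentence representations can be linearly interpolated with the classifier's softmax output; the feature dimensionality, the value of k, the distance temperature, and the mixing weight lam are all illustrative assumptions.

```python
# Hypothetical sketch of k-NN-augmented prediction over cached PLM features.
# All hyperparameters (k, temp, lam) and the toy data are assumptions for illustration.
import numpy as np

def knn_distribution(query, train_feats, train_labels, num_classes, k=8, temp=1.0):
    """Softmax over negative Euclidean distances to the k nearest cached
    training representations, with the weights aggregated per class."""
    dists = np.linalg.norm(train_feats - query, axis=1)   # distance to every cached example
    nn_idx = np.argsort(dists)[:k]                        # indices of the k nearest neighbors
    weights = np.exp(-dists[nn_idx] / temp)
    weights /= weights.sum()
    p_knn = np.zeros(num_classes)
    for w, idx in zip(weights, nn_idx):
        p_knn[train_labels[idx]] += w                     # scatter neighbor weight to its label
    return p_knn

def interpolate(p_plm, p_knn, lam=0.3):
    """Step (2): linear interpolation of the k-NN and PLM classifier distributions."""
    return lam * p_knn + (1.0 - lam) * p_plm

# Toy usage: random vectors stand in for PLM sentence representations.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(100, 16))
train_labels = rng.integers(0, 3, size=100)
query = rng.normal(size=16)
p_plm = np.array([0.2, 0.5, 0.3])                         # classifier softmax output
p_knn = knn_distribution(query, train_feats, train_labels, num_classes=3)
print(interpolate(p_plm, p_knn))
```

In the same spirit, the k-NN distribution could also be compared against the gold label during training to flag easy versus hard examples (step 1), though the exact calibration scheme is specified in the paper body rather than here.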