Fine-tuning a language model on a new domain is standard practice for domain adaptation. However, fine-tuning can be infeasible for modern large-scale language models such as GPT-3, which are accessible only through APIs, leaving their internal parameters out of reach. In this paper, we propose $k$NN-Adapter, a method to effectively adapt these black-box large language models (LLMs) to a new domain. $k$NN-Adapter builds on the retrieval-augmented language model, and adaptively learns to interpolate the output of the language model with retrieval results from a datastore built from the target domain data. Our experiments on four different domains demonstrate that $k$NN-Adapter significantly improves perplexity, and works particularly well in settings with limited access to LLMs. Additionally, we show that $k$NN-Adapter is more effective than fine-tuning when the amount of training data is limited. We also release a dataset to encourage further study.
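The core interpolation can be sketched in the style of retrieval-augmented ($k$NN-LM) decoding. This is an illustrative sketch only: the function name, parameters, and the fixed weight `lam` are assumptions, whereas $k$NN-Adapter learns the interpolation weight adaptively rather than fixing it.

```python
import numpy as np

def knn_interpolate(p_lm, knn_distances, knn_token_ids, vocab_size,
                    lam=0.25, temperature=1.0):
    """Blend the LM's next-token distribution with a kNN distribution
    built from retrieved neighbors (illustrative; kNN-Adapter learns
    the weight `lam` adaptively instead of fixing it)."""
    # Softmax over negative neighbor distances yields the kNN distribution.
    logits = -np.asarray(knn_distances, dtype=float) / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Scatter neighbor mass onto the vocabulary (neighbors sharing a
    # target token accumulate weight).
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(knn_token_ids, weights):
        p_knn[tok] += w
    # Linear interpolation of the two distributions.
    return lam * p_knn + (1.0 - lam) * np.asarray(p_lm, dtype=float)
```

For example, with a uniform LM distribution over four tokens and two equidistant neighbors pointing at tokens 1 and 2, the blended distribution shifts probability mass toward those tokens while remaining a valid distribution.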