The invention of transformer-based models such as BERT, GPT, and RoBERTa has enabled researchers and financial companies to fine-tune these powerful models and apply them to different downstream tasks with state-of-the-art performance. Recently, a lightweight alternative to fine-tuning, known as prefix tuning, has been introduced; it trains only approximately 0.1%-3% of the original model parameters. This method freezes the model parameters and updates only the prefix, achieving performance comparable to full fine-tuning. Prefix tuning thus enables researchers and financial practitioners to obtain similar results with far fewer trainable parameters. In this paper, we explore the robustness of prefix tuning when facing noisy data. Our experiments demonstrate that fine-tuning is more robust to noise than prefix tuning: the latter suffers a significant drop in performance on most corrupted data sets as the noise level increases. Furthermore, prefix tuning exhibits higher variance in F1 scores than fine-tuning under many corruption methods. We strongly advocate exercising caution when applying the state-of-the-art prefix tuning method to noisy data.
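To make the parameter-efficiency claim concrete, the following is a minimal sketch of how prefix tuning freezes the base model and trains only a small set of prefix parameters, assuming the Hugging Face transformers and peft libraries; the model name, number of virtual tokens, and task type are illustrative assumptions, not the exact setup used in this paper.

```python
# Minimal prefix-tuning sketch (illustrative, not the authors' exact configuration).
from transformers import AutoModelForSequenceClassification
from peft import PrefixTuningConfig, TaskType, get_peft_model

# Load a pretrained encoder for a downstream classification task
# (model name is an assumption for illustration).
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Prefix tuning: the base model's weights stay frozen; only the trainable
# prefix ("virtual token") parameters prepended to each layer are updated.
config = PrefixTuningConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification
    num_virtual_tokens=20,       # length of the trainable prefix (assumed value)
)
peft_model = get_peft_model(model, config)

# Reports the share of trainable parameters, typically on the order of
# 0.1%-3% of the full model, matching the range cited in the abstract.
peft_model.print_trainable_parameters()
```

Training then proceeds as usual, except the optimizer only receives the prefix parameters, which is what keeps the method lightweight relative to full fine-tuning.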