Recent work has indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques. Inspired by this line of work, in this paper we propose a fully unsupervised approach to improving word-in-context (WiC) representations in PLMs, achieved via a simple and efficient WiC-targeted fine-tuning procedure: MirrorWiC. The proposed method leverages only raw texts sampled from Wikipedia, assumes no sense-annotated data, and learns context-aware word representations within a standard contrastive learning setup. We experiment with a series of standard and comprehensive WiC benchmarks across multiple languages. Our fully unsupervised MirrorWiC models obtain substantial gains over off-the-shelf PLMs across all monolingual, multilingual, and cross-lingual setups. Moreover, on some standard WiC benchmarks, MirrorWiC is even on par with supervised models fine-tuned with in-task data and sense labels.
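To make the "standard contrastive learning setup" mentioned above concrete, the following is a minimal sketch of WiC-targeted contrastive fine-tuning in the Mirror-BERT style: each sentence is encoded twice so that dropout noise alone yields two different views of the same target-word occurrence, and an InfoNCE loss pulls the two views together while pushing apart other occurrences in the batch. The helper names, the temperature value, and the mean-pooling over the target word's sub-tokens are illustrative assumptions, not the authors' exact implementation (which also uses additional augmentations such as random masking).

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.train()  # keep dropout active: it serves as the data augmentation

def encode_word(sentences, spans):
    """Mean-pool the contextual vectors of each target word's sub-tokens.

    `spans` holds the character offsets (start, end) of the target word
    in each sentence.
    """
    batch = tokenizer(sentences, return_tensors="pt",
                      padding=True, truncation=True)
    hidden = encoder(**batch).last_hidden_state              # (B, T, H)
    reps = []
    for i, (start, end) in enumerate(spans):
        tok_ids = [t for t in range(hidden.size(1))
                   if batch.token_to_chars(i, t) is not None
                   and batch.token_to_chars(i, t).start >= start
                   and batch.token_to_chars(i, t).end <= end]
        reps.append(hidden[i, tok_ids].mean(dim=0))
    return torch.stack(reps)                                  # (B, H)

def mirror_infonce(sentences, spans, temperature=0.04):
    """Two dropout-noised encodings of the same occurrence are positives."""
    z1 = F.normalize(encode_word(sentences, spans), dim=-1)
    z2 = F.normalize(encode_word(sentences, spans), dim=-1)   # new dropout mask
    logits = z1 @ z2.t() / temperature                        # (B, B) similarities
    labels = torch.arange(len(sentences))                     # positives on the diagonal
    return F.cross_entropy(logits, labels)
```

In this reading, the only supervision signal is the identity of each occurrence with its own second encoding, which is why the procedure needs nothing beyond raw text.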