Although contextualized embeddings generated by large-scale pre-trained models perform well on many tasks, traditional static embeddings (e.g., Skip-gram, Word2Vec) still play an important role in low-resource and lightweight settings due to their low computational cost, ease of deployment, and stability. In this paper, we aim to improve word embeddings by 1) incorporating more contextual information from existing pre-trained models into the Skip-gram framework, which we call Context-to-Vec; and 2) proposing a post-processing retrofitting method for static embeddings, independent of training, that employs prior synonym knowledge and a weighted vector distribution. On both extrinsic and intrinsic tasks, our methods are shown to outperform the baselines by a large margin.
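For readers unfamiliar with retrofitting, the sketch below illustrates the generic idea behind the post-processing step: each word vector is nudged toward the vectors of its synonyms while staying close to its original position. This follows the standard formulation of Faruqui et al. (2015) with a single uniform synonym weight; the function name `retrofit` and the uniform weighting are illustrative assumptions, not the paper's specific weighted vector distribution.

```python
import numpy as np

def retrofit(embeddings, synonyms, alpha=1.0, beta=1.0, n_iters=10):
    """Generic synonym-based retrofitting (a minimal sketch, not the paper's exact method).

    embeddings: dict mapping word -> np.ndarray (original static vectors)
    synonyms:   dict mapping word -> list of synonyms (prior lexical knowledge)
    alpha:      weight on staying close to the original vector
    beta:       weight on each synonym vector (assumed uniform here)
    """
    new_vecs = {w: v.copy() for w, v in embeddings.items()}
    for _ in range(n_iters):
        for word, neighbors in synonyms.items():
            nbrs = [n for n in neighbors if n in new_vecs]
            if word not in new_vecs or not nbrs:
                continue
            # Closed-form update: weighted average of the original vector
            # and the current vectors of the word's synonyms.
            num = alpha * embeddings[word] + beta * np.sum([new_vecs[n] for n in nbrs], axis=0)
            new_vecs[word] = num / (alpha + beta * len(nbrs))
    return new_vecs

# Toy usage: the two synonyms drift toward each other; the unrelated word is untouched.
vecs = {"happy": np.array([1.0, 0.0]), "glad": np.array([0.0, 1.0]), "table": np.array([5.0, 5.0])}
lexicon = {"happy": ["glad"], "glad": ["happy"]}
retrofitted = retrofit(vecs, lexicon)
print(retrofitted["happy"], retrofitted["table"])
```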