Incorporating lexical knowledge into deep learning models has proven highly effective for sequence labeling tasks. However, prior work often struggles with large-scale dynamic lexicons, which introduce excessive matching noise and require frequent updates. In this paper, we propose DyLex, a plug-in lexicon incorporation approach for BERT-based sequence labeling tasks. Instead of leveraging embeddings of the words in the lexicon, as conventional methods do, we adopt word-agnostic tag embeddings to avoid retraining the representations whenever the lexicon is updated. Moreover, we employ an effective supervised lexical knowledge denoising method to filter out matching noise. Finally, we introduce a col-wise attention-based knowledge fusion mechanism to guarantee the pluggability of the proposed framework. Experiments on ten datasets across three tasks show that the proposed framework achieves new state-of-the-art (SOTA) results, even with very large-scale lexicons.
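To make the three ingredients of the abstract concrete, here is a minimal sketch of one plausible reading of the approach: match the input against a lexicon, embed only word-agnostic match tags (B/I/O) rather than the matched words themselves, and fuse those tag embeddings with token representations via attention over candidate matches. The greedy matcher, the `TagFusion` module, all shapes and names, and the interpretation of "col-wise attention" are illustrative assumptions, not the authors' implementation.

```python
# Sketch of the DyLex idea (assumed, not the official code): because only the
# handful of tag embeddings are trained, the lexicon can grow or change
# without retraining any word representations.
import torch
import torch.nn as nn
import torch.nn.functional as F

TAGS = {"O": 0, "B": 1, "I": 2}  # word-agnostic match tags

def match_lexicon(tokens, lexicon):
    """Toy greedy longest-match, returning one B/I/O tag id per token.
    A real system would use a trie and keep every candidate match as a
    separate row, leaving the supervised denoiser to discard bad ones."""
    tags = [TAGS["O"]] * len(tokens)
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):
            if " ".join(tokens[i:j]) in lexicon:
                tags[i] = TAGS["B"]
                for k in range(i + 1, j):
                    tags[k] = TAGS["I"]
                i = j - 1
                break
        i += 1
    return tags

class TagFusion(nn.Module):
    """Fuse tag embeddings with token encodings (hypothetical module)."""
    def __init__(self, hidden_size=768, tag_dim=64):
        super().__init__()
        self.tag_emb = nn.Embedding(len(TAGS), tag_dim)
        self.query = nn.Linear(hidden_size, tag_dim)

    def forward(self, token_states, tag_ids):
        # token_states: (batch, seq, hidden); tag_ids: (batch, n_matches, seq)
        tags = self.tag_emb(tag_ids)               # (B, M, S, D)
        q = self.query(token_states).unsqueeze(1)  # (B, 1, S, D)
        # Per token position, attend over the candidate match sequences --
        # our assumed reading of the paper's "col-wise" attention.
        scores = (q * tags).sum(-1)                # (B, M, S)
        weights = F.softmax(scores, dim=1).unsqueeze(-1)
        fused = (weights * tags).sum(1)            # (B, S, D)
        return torch.cat([token_states, fused], dim=-1)

lexicon = {"new york", "york city"}
tokens = "i love new york city".split()
tag_ids = torch.tensor([[match_lexicon(tokens, lexicon)]])  # (1, 1, 5)
token_states = torch.randn(1, len(tokens), 768)             # stand-in for BERT
print(TagFusion()(token_states, tag_ids).shape)             # (1, 5, 832)
```

Note how the fused output carries only where the lexicon matched, not which entry matched; this is what lets the lexicon be swapped dynamically without touching any trained parameters.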