We introduce BioCoM, a contrastive learning framework for biomedical entity linking that uses only two resources: a small dictionary and a large collection of raw biomedical articles. Specifically, we build training instances from raw PubMed articles by dictionary matching and use them to train a context-aware entity linking model with contrastive learning. At inference time, we predict the normalized biomedical entity through a nearest-neighbor search. Results show that BioCoM substantially outperforms state-of-the-art models, especially in low-resource settings, by effectively using the context of the entities.
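To make the inference step concrete, the following is a minimal sketch of nearest-neighbor entity linking. The concept IDs and embedding vectors are invented for illustration only; in the actual system, mention and dictionary embeddings would come from the contrastively trained encoder, and the search would run over a full dictionary rather than three hand-made vectors.

```python
import math

# Hypothetical entity dictionary mapping concept IDs to toy embedding vectors.
# In practice these vectors would be produced by the trained encoder; they are
# hand-crafted here solely to illustrate the nearest-neighbor lookup.
DICTIONARY = {
    "C0004096": [0.9, 0.1, 0.0],  # asthma
    "C0011849": [0.1, 0.9, 0.0],  # diabetes mellitus
    "C0020538": [0.0, 0.1, 0.9],  # hypertension
}

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def link_mention(mention_vec):
    """Link a mention embedding to the most similar dictionary entity."""
    return max(DICTIONARY, key=lambda cui: cosine(mention_vec, DICTIONARY[cui]))

# A mention embedding close to the "asthma" vector links to that concept ID.
print(link_mention([0.8, 0.2, 0.1]))
```

At scale, this exact search would typically be replaced by an approximate nearest-neighbor index, but the prediction rule is the same: the mention is normalized to the dictionary entry whose embedding is closest.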