Biomedical word embeddings are usually pre-trained on free-text corpora with neural methods that capture local and global distributional properties. They are leveraged in downstream tasks using various neural architectures designed to optimize task-specific objectives that may further tune such embeddings. Since 2018, however, there has been a marked shift from these static embeddings to contextual embeddings motivated by language models (e.g., ELMo, transformers such as BERT, and ULMFiT). These dynamic embeddings have the added benefit of being able to distinguish homonyms and acronyms given their context. However, static embeddings are still relevant in low-resource settings (e.g., smart devices, IoT elements) and for studying lexical semantics from a computational linguistics perspective. In this paper, we jointly learn word and concept embeddings by first using the skip-gram method and then fine-tuning them with correlational information manifesting in co-occurring Medical Subject Heading (MeSH) concepts in biomedical citations. This fine-tuning is accomplished with the BERT transformer architecture in the two-sentence input mode with a classification objective that captures MeSH pair co-occurrence. In essence, we repurpose a transformer architecture (typically used to generate dynamic embeddings) to improve static embeddings using concept correlations. We evaluate these tuned static embeddings on multiple word-relatedness datasets developed by previous efforts. Without selectively culling concepts and terms (as was done in previous efforts), we believe we offer the most exhaustive evaluation of static embeddings to date, with clear performance improvements across the board. We provide our code and embeddings for public use in downstream applications and research endeavors: https://github.com/bionlproc/BERT-CRel-Embeddings
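The core idea of the fine-tuning step can be sketched as follows: pretrained static embeddings of two MeSH concepts are fed to a classifier that predicts whether the pair co-occurs in a citation, and the classification loss is backpropagated into the embeddings themselves. The sketch below is a deliberate simplification assuming a plain logistic classifier in place of the paper's BERT two-sentence architecture, randomly initialized toy vectors in place of real skip-gram embeddings, and a handful of illustrative MeSH descriptor IDs and labels; it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for skip-gram-pretrained concept vectors (assumption:
# real vectors would come from pretraining on biomedical free text).
vocab = ["D003920", "D007333", "D009765", "D006973"]  # illustrative MeSH IDs
dim = 8
emb = {c: rng.normal(scale=0.1, size=dim) for c in vocab}

# Hypothetical training pairs: label 1 if the two MeSH descriptors
# co-occur in a citation's indexing, 0 otherwise.
pairs = [("D003920", "D007333", 1),
         ("D003920", "D009765", 1),
         ("D006973", "D007333", 0)]

w = rng.normal(scale=0.1, size=2 * dim)  # pair-classifier weights
lr = 0.5

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for _ in range(200):
    for a, b, y in pairs:
        x = np.concatenate([emb[a], emb[b]])
        p = sigmoid(w @ x)
        g = p - y                    # d(log loss)/d(logit)
        grad_w = g * x
        grad_a = g * w[:dim]         # gradient flows into the embeddings,
        grad_b = g * w[dim:]         # which is what "fine-tunes" them
        w -= lr * grad_w
        emb[a] -= lr * grad_a
        emb[b] -= lr * grad_b

def score(a, b):
    """Predicted co-occurrence probability for a concept pair."""
    return sigmoid(w @ np.concatenate([emb[a], emb[b]]))
```

After training, co-occurring pairs score higher than non-co-occurring ones, and the updated vectors in `emb` are the fine-tuned static embeddings; in the paper this role is played by the BERT input embedding matrix after training on MeSH pair classification.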