Language models (LMs) have performed well on biomedical natural language processing applications. In this study, we use prompting methods to extract knowledge from LMs, treating them as a new kind of knowledge base (LMs as KBs). However, prompting yields only a lower bound on the knowledge that can be extracted, and it performs particularly poorly on biomedical-domain KBs. To bring LMs as KBs closer to real application scenarios in the biomedical domain, we add electronic health record (EHR) notes as context to the prompt in order to raise this lower bound. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge these language models possess allows them to distinguish correct knowledge from noisy knowledge in the EHR notes, and that this ability to distinguish can also serve as a new metric for evaluating how much knowledge a model possesses.
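The idea of adding an EHR note as context to a cloze-style prompt can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the `Probe` class, the `[X]`/`[Y]` template placeholders, the `[MASK]` token, the `build_prompt` and `accuracy_at_1` helpers, and the example note and fact are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MASK = "[MASK]"  # assumed mask token; the real token depends on the model


@dataclass
class Probe:
    """One hypothetical knowledge probe: a subject, a cloze template, a gold answer."""
    subject: str
    template: str  # assumed LAMA-style template with [X] (subject) and [Y] (object)
    answer: str


def build_prompt(probe: Probe, note: Optional[str] = None) -> str:
    """Fill the template; if an EHR note is given, prepend it as context."""
    cloze = probe.template.replace("[X]", probe.subject).replace("[Y]", MASK)
    if note is None:
        return cloze  # plain prompt, no context
    return f"{note.strip()} {cloze}"  # context-augmented prompt


def accuracy_at_1(predictions: Sequence[str], probes: Sequence[Probe]) -> float:
    """Fraction of probes whose top prediction matches the gold answer (case-insensitive)."""
    if not probes:
        return 0.0
    hits = sum(
        pred.strip().lower() == probe.answer.lower()
        for pred, probe in zip(predictions, probes)
    )
    return hits / len(probes)


# Illustrative, made-up example (not data from the paper).
probe = Probe(subject="fever", template="[X] is a symptom of [Y].", answer="influenza")
plain = build_prompt(probe)
with_note = build_prompt(probe, note="Patient reports fever and cough.")
```

Under these assumptions, one would feed `plain` and `with_note` to the same masked language model and compare `accuracy_at_1` across the two settings, and likewise between notes that contain the correct fact and notes that contain only noise.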