Medical decision-making can be enhanced by comprehensive biomedical knowledge bases, which require fusing knowledge graphs constructed from different sources via a uniform index system. The index system often organizes biomedical terms in a hierarchy so that entities can be aligned at a fine granularity. To address the scarcity of supervision in the biomedical knowledge fusion (BKF) task, researchers have proposed various unsupervised methods. However, these methods rely heavily on ad-hoc lexical and structural matching algorithms, which fail to capture the rich semantics conveyed by biomedical entities and terms. Recently, neural embedding models have proved effective in semantically rich tasks, but they need sufficient labeled data to be adequately trained. To bridge the gap between the label-scarce BKF task and neural embedding models, we propose HiPrompt, a supervision-efficient knowledge fusion framework that elicits the few-shot reasoning ability of large language models through hierarchy-oriented prompts. Empirical results on the collected KG-Hi-BKF benchmark datasets demonstrate the effectiveness of HiPrompt.