There is a rapidly growing need for high-quality biomedical knowledge graphs (BioKGs) that provide direct and precise biomedical knowledge, and the COVID-19 pandemic has made this need even more pressing. However, most BioKG construction pipelines inevitably introduce numerous conflicts and considerable noise, which stem from incorrect knowledge descriptions in the literature and from imperfect information extraction techniques. Many studies have demonstrated that reasoning over a knowledge graph is effective in eliminating such conflicts and noise. This paper proposes BioGRER, a method for improving BioKG quality that combines knowledge graph embedding with logic rules that support or negate triplets in the BioKG. In the proposed model, BioKG refinement is formulated as estimating the probability of triplets in the BioKG. We employ the variational EM algorithm to optimize knowledge graph embedding and logic rule inference alternately. In this way, our model draws on both knowledge graph embedding and logic rules, leading to better results than using either alone. We evaluate our model on a COVID-19 knowledge graph and obtain competitive results.
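To make the idea of triplet probability estimation concrete, the following is a minimal sketch of how an embedding-based plausibility score and rule evidence might be merged into one probability. It is not BioGRER's actual model: the TransE-style score, the additive combination of logits, the rule weights, and every name and number below are assumptions chosen for illustration, and the alternating variational EM optimization is not shown.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def transe_logit(head, rel, tail, margin=1.0):
    """TransE-style plausibility: margin minus the L1 distance between head + rel and tail.

    Assumption: TransE is used only as a familiar stand-in for a knowledge graph
    embedding model; the abstract does not say which embedding model is used.
    """
    dist = sum(abs(h + r - t) for h, r, t in zip(head, rel, tail))
    return margin - dist


def triplet_probability(emb_logit, rule_evidence):
    """Combine an embedding logit with weighted rule groundings (assumed additive form).

    rule_evidence is a list of (weight, fired) pairs. A positive weight models a
    supporting rule, a negative weight models a negating rule, and fired says
    whether the rule body holds for this triplet.
    """
    rule_logit = sum(weight for weight, fired in rule_evidence if fired)
    return sigmoid(emb_logit + rule_logit)


# Hypothetical 2-dimensional embeddings for a (drug, treats, disease) triplet.
drug = [0.1, 0.2]
treats = [0.4, 0.1]
disease = [0.5, 0.3]

logit = transe_logit(drug, treats, disease)

p_plain = triplet_probability(logit, [])                 # embedding only
p_supported = triplet_probability(logit, [(1.5, True)])  # one supporting rule fires
p_negated = triplet_probability(logit, [(-3.0, True)])   # one negating rule fires

print(p_plain, p_supported, p_negated)
```

In this sketch a fired negating rule pulls the probability of a triplet down even when its embedding score is high, which is the kind of signal a refinement step could use to flag a triplet as likely noise.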