Knowledge graphs such as Wikidata represent knowledge through both structural and textual modalities. For each of the two modalities, dedicated approaches, graph embedding models for structure and language models for text, learn patterns that allow novel structural knowledge to be predicted. Few approaches integrate learning and inference over both modalities, and those that exist exploit the interaction of structural and textual knowledge only partially. Our approach builds on strong existing single-modality representations and uses hypercomplex algebra to represent (i) the embedding of each single modality and (ii) the interaction between the modalities and their complementary means of knowledge representation. More specifically, we propose Dihedron and Quaternion representations over 4D hypercomplex numbers to integrate four modalities: structural knowledge graph embeddings, word-level representations (e.g.\ Word2vec, fastText), sentence-level representations (Sentence Transformer), and document-level representations (Sentence Transformer, Doc2vec). Our unified vector representation scores the plausibility of labelled edges via the Hamilton and Dihedron products, thereby modeling pairwise interactions between the different modalities. Extensive experimental evaluation on standard benchmark datasets demonstrates the superiority of our two new models, which exploit abundant textual information alongside sparse structural knowledge to improve performance in link prediction tasks.
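As an illustrative sketch of the quaternion case (the Dihedron product is analogous under a different multiplication table), write each 4D representation as $q = a + b\,\mathbf{i} + c\,\mathbf{j} + d\,\mathbf{k}$, with the four components carrying the structural, word-level, sentence-level, and document-level embeddings. The standard Hamilton product is then
\[
q_1 \otimes q_2 = (a_1 a_2 - b_1 b_2 - c_1 c_2 - d_1 d_2) + (a_1 b_2 + b_1 a_2 + c_1 d_2 - d_1 c_2)\,\mathbf{i} + (a_1 c_2 - b_1 d_2 + c_1 a_2 + d_1 b_2)\,\mathbf{j} + (a_1 d_2 + b_1 c_2 - c_1 b_2 + d_1 a_2)\,\mathbf{k},
\]
so every component of the product mixes all four modality components of both factors. One natural scoring choice for a labelled edge $(h, r, t)$, in the style of QuatE and given here only as an illustration rather than the exact formulation of our models, is $f(h, r, t) = \langle h \otimes r^{\triangleleft}, t \rangle$, where $r^{\triangleleft}$ denotes the relation quaternion normalised to unit norm and $\langle \cdot, \cdot \rangle$ the component-wise inner product.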