We address the heterogeneity of knowledge bases (KBs), e.g., ontologies and schemas, modeled as sets of entity types (etypes), e.g., person, where each etype is associated with a set of properties, e.g., age or height, via an inheritance hierarchy. A large body of literature exists on this topic. A common approach is to model KBs as labeled graphs and to reduce KB matching to matching two elements, \textit{viz.}, the labels and the structure of the graph. However, etype labels are often misplaced, e.g., they are more general or more specific than the correct etype, as defined by its properties. Structure-based matching may also lead to wrong conclusions, because the properties assigned to an etype in an inheritance hierarchy do not depend on the order in which they are assigned and, therefore, do not depend on the specific structure of the graph. In this paper, we propose a novel etype graph matching approach that deals with the two problems highlighted above and is based on two key ideas. The first is to implement matching as a classification task in which etypes are characterized by their associated properties. The second is to introduce two \textit{property-based} etype similarity metrics, which model the roles that properties play in the definition of an etype. The experimental results show the effectiveness of the algorithm, in particular for etype graphs with a high number of properties.
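
To make the property-based view concrete, the following minimal sketch compares etypes by their property sets alone and ignores both labels and graph structure. It is not the method proposed in the paper: plain Jaccard overlap stands in for the two property-based metrics, a best-score lookup stands in for the classifier, and the names `jaccard`, `match_etypes`, `threshold`, and the toy KBs are all hypothetical.

```python
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Fraction of properties shared by two etypes (assumed stand-in metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_etypes(
    source: Dict[str, Set[str]],
    target: Dict[str, Set[str]],
    threshold: float = 0.5,
) -> Dict[str, Tuple[str, float]]:
    """Map each source etype to the target etype with the most similar properties.

    Labels are used only as dictionary keys; the decision depends on
    properties alone. Pairs scoring below `threshold` are left unmatched.
    """
    matches: Dict[str, Tuple[str, float]] = {}
    for s_name, s_props in source.items():
        best_name, best_score = None, 0.0
        for t_name, t_props in target.items():
            score = jaccard(s_props, t_props)
            if score > best_score:
                best_name, best_score = t_name, score
        if best_name is not None and best_score >= threshold:
            matches[s_name] = (best_name, best_score)
    return matches


# Toy KBs (hypothetical): labels differ, but the property sets overlap.
kb_a = {
    "Person": {"name", "age", "height"},
    "Place": {"name", "latitude", "longitude"},
}
kb_b = {
    "Human": {"name", "age", "height", "birthdate"},
    "Agent": {"name"},
    "Location": {"name", "latitude", "longitude", "country"},
}

result = match_etypes(kb_a, kb_b)
for src, (tgt, score) in result.items():
    print(f"{src} -> {tgt} ({score:.2f})")
```

In this toy setting, Person is paired with Human rather than with the more general Agent, although none of the labels coincide, because the decision rests only on shared properties.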