Large knowledge graphs (KGs) such as DBpedia and YAGO are always based on the same source, namely Wikipedia. However, many other wikis, for example those hosted on platforms like Fandom, contain information about long-tail entities. In this paper, we present the approach and analysis of DBkWik++, a fused knowledge graph built from thousands of wikis. A modified version of the DBpedia framework is applied to each wiki, which results in many isolated knowledge graphs. With an incremental merge-based approach, we reuse one-to-one matching systems to solve the multi-source KG matching task. Based on the resulting alignment, we create a consolidated knowledge graph with more than 15 million instances.
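The incremental merge idea can be illustrated with a minimal sketch. It is not the DBkWik++ implementation: the toy KG representation (entity id to label), the `label_matcher` function, and the example wiki data are all assumptions made for illustration, and entity ids are assumed to be unique across wikis. The point it shows is only that a pairwise (one-to-one) matcher can be reused for many sources by repeatedly matching the next KG against the graph merged so far.

```python
from typing import Callable, Dict, List, Set, Tuple

KG = Dict[str, str]  # entity id -> label (toy stand-in for a real KG)
Matcher = Callable[[KG, KG], List[Tuple[str, str]]]


def label_matcher(left: KG, right: KG) -> List[Tuple[str, str]]:
    """Hypothetical one-to-one matcher: pairs entities whose normalized labels are equal."""
    index: Dict[str, str] = {}
    for left_id, label in left.items():
        index.setdefault(label.strip().lower(), left_id)
    pairs: List[Tuple[str, str]] = []
    used: Set[str] = set()
    for right_id, label in right.items():
        left_id = index.get(label.strip().lower())
        # Each left entity may be matched at most once (one-to-one constraint).
        if left_id is not None and left_id not in used:
            pairs.append((left_id, right_id))
            used.add(left_id)
    return pairs


def incremental_merge(kgs: List[KG], matcher: Matcher) -> List[Set[str]]:
    """Fold the KGs into one, calling the pairwise matcher once per merge step."""
    if not kgs:
        return []
    merged: KG = dict(kgs[0])
    # Each cluster collects the ids that refer to the same real-world entity.
    clusters: Dict[str, Set[str]] = {eid: {eid} for eid in merged}
    for kg in kgs[1:]:
        matched = {right_id: left_id for left_id, right_id in matcher(merged, kg)}
        for eid, label in kg.items():
            if eid in matched:
                clusters[matched[eid]].add(eid)   # fuse into an existing entity
            else:
                merged[eid] = label               # new entity enters the merged KG
                clusters[eid] = {eid}
    return list(clusters.values())


# Made-up example data: three tiny "wikis" with overlapping entities.
wikis = [
    {"a:Link": "Link", "a:Zelda": "Zelda"},
    {"b:Link": "link", "b:Ganon": "Ganon"},
    {"c:Ganon": "Ganon ", "c:Epona": "Epona"},
]
clusters = incremental_merge(wikis, label_matcher)
```

In this sketch, n sources need n - 1 matcher calls, rather than one call per pair of sources. A real matching system would also compare properties and classes, not only labels.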