Knowledge graphs (KGs) have become the preferred technology for representing, sharing and adding knowledge to modern AI applications. While KGs have become a mainstream technology, the RDF/SPARQL-centric toolset for working with them at scale is heterogeneous, difficult to integrate and covers only a subset of the operations commonly needed in data science applications. In this paper, we present KGTK, a data science-centric toolkit designed to represent, create, transform, enhance and analyze KGs. KGTK represents graphs in tables and leverages popular libraries developed for data science applications, enabling a wide audience of developers to easily construct knowledge graph pipelines for their applications. We illustrate the framework with real-world scenarios in which we have used KGTK to integrate and manipulate large KGs, such as Wikidata, DBpedia and ConceptNet.
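To make the idea of representing a graph as a table concrete, the following is a minimal sketch and not KGTK's own implementation. It assumes a tab-separated edge table with one row per edge; the column names (`node1`, `label`, `node2`), the sample rows, and the helper functions `load_edges`, `filter_edges` and `out_degree` are all hypothetical, chosen only for illustration. It uses the Python standard library so that it runs without extra dependencies.

```python
import csv
import io
from collections import Counter

# Hypothetical edge table: one row per edge, tab-separated.
# Column names and rows are assumptions made for this sketch.
EDGES_TSV = (
    "node1\tlabel\tnode2\n"
    "alice\tknows\tbob\n"
    "alice\tworks_at\tacme\n"
    "bob\tknows\tcarol\n"
    "carol\tworks_at\tacme\n"
)


def load_edges(text):
    """Parse a tab-separated edge table into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_edges(edges, label):
    """Keep only the edges with the given label (a tabular 'select')."""
    return [edge for edge in edges if edge["label"] == label]


def out_degree(edges):
    """Count outgoing edges per subject node (a tabular 'group by')."""
    return Counter(edge["node1"] for edge in edges)


edges = load_edges(EDGES_TSV)
knows_edges = filter_edges(edges, "knows")
degrees = out_degree(edges)
```

Because each edge is an ordinary table row, graph operations such as filtering by relation or computing node degree reduce to the select and group-by steps that general-purpose data science libraries already provide, which is the design choice the abstract points to.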