Distributed Subweb Specifications for Traversing the Web

Link Traversal-based Query Processing (LTQP), in which a SPARQL query is evaluated over a web of documents rather than a single dataset, is often seen as a theoretically interesting yet impractical technique. However, at a time when the hypercentralization of data has increasingly come under scrutiny, a decentralized Web of Data with a simple document-based interface is appealing, as it enables data publishers to control their data and access rights. While LTQP allows evaluating complex queries over such webs, it suffers from performance issues (due to the high number of documents containing data) as well as information quality concerns (due to the many sources providing such documents). In existing LTQP approaches, the burden of finding sources to query rests entirely with the data consumer. In this paper, we argue that to solve these issues, data publishers should also be able to suggest sources of interest and guide the data consumer towards relevant and trustworthy data. We introduce a theoretical framework that enables such guided link traversal and study its properties. We illustrate with a theoretical example that this can improve query results and reduce the number of network requests. We evaluate our proposal experimentally on a virtual linked web with specifications and indeed observe that not just the data quality but also the efficiency of querying improves. Under consideration in Theory and Practice of Logic Programming (TPLP).
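To make the contrast between unguided and publisher-guided traversal concrete, here is a minimal sketch in Python. It is not the framework from the paper: the in-memory `WEB` dictionary, the `suggested` field, and the functions `traverse` and `people_known_by` are all hypothetical names invented for illustration, and a dictionary lookup stands in for an HTTP request. The only assumption carried over from the abstract is that a publisher can recommend which links a consumer should follow.

```python
from collections import deque

# Hypothetical in-memory "web": each document has triples, all outgoing
# links, and the subset of links its publisher suggests (an assumption
# made for this sketch, not the paper's specification language).
WEB = {
    "alice": {
        "triples": [("alice", "knows", "bob")],
        "links": ["bob", "spam"],
        "suggested": ["bob"],
    },
    "bob": {
        "triples": [("bob", "knows", "carol")],
        "links": ["carol"],
        "suggested": ["carol"],
    },
    "carol": {
        "triples": [("carol", "name", "Carol")],
        "links": [],
        "suggested": [],
    },
    "spam": {
        "triples": [("alice", "knows", "mallory")],
        "links": ["spam2"],
        "suggested": [],
    },
    "spam2": {
        "triples": [("mallory", "name", "Mallory")],
        "links": [],
        "suggested": [],
    },
}


def traverse(seed, guided):
    """Breadth-first traversal from seed; returns (triples, fetched URLs)."""
    seen = {seed}
    queue = deque([seed])
    triples = []
    fetched = []
    while queue:
        url = queue.popleft()
        doc = WEB[url]  # stands in for one network request
        fetched.append(url)
        triples.extend(doc["triples"])
        # Guided mode follows only publisher-suggested links.
        next_links = doc["suggested"] if guided else doc["links"]
        for link in next_links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return triples, fetched


def people_known_by(triples, person):
    """Answer a tiny 'whom does person know?' query over collected triples."""
    return sorted(o for s, p, o in triples if s == person and p == "knows")


unguided_triples, unguided_fetched = traverse("alice", guided=False)
guided_triples, guided_fetched = traverse("alice", guided=True)
print(len(unguided_fetched), people_known_by(unguided_triples, "alice"))
print(len(guided_fetched), people_known_by(guided_triples, "alice"))
```

In this toy web, the unguided run fetches every reachable document and picks up a statement about Alice from an untrusted source, while the guided run makes fewer requests and returns only what the trusted documents assert.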


Related content

TPLP

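Theory and Practice of Logic Programming is an international journal that publishes work covering both the theory and practice of logic programming. Logic is applicable to all areas of artificial intelligence and computer science, and logic programming is fundamental to these areas. Topics include artificial intelligence applications that use logic programming, logic programming methodologies, specification, analysis and verification of systems, inductive logic programming, multi-relational data mining, natural language processing, knowledge representation, non-monotonic reasoning, Semantic Web reasoning, databases, implementations and architectures, and constraint logic programming. Official website: https://www.cambridge.org/core/journals/theory-and-practice-of-logic-programming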
