A powerful way to understand a complex query is to observe how it operates on data instances. However, specific database instances are not ideal for such observations: they often include large amounts of superfluous detail that is not only irrelevant to understanding the query but also causes cognitive overload, and a single database may not be enough. Given a relational query, is it possible to provide a simple and generic "representative" instance that (1) illustrates how the query can be satisfied, and (2) summarizes all specific instances that would satisfy the query in the same way by abstracting away unnecessary details? Furthermore, is it possible to find a collection of such representative instances that together completely characterize all possible ways in which the query can be satisfied? This paper takes initial steps towards answering these questions. We design what these representative instances look like, define what they stand for, and formalize what it means for them to satisfy a query in "all possible ways." We argue that this problem is undecidable for general domain relational calculus queries, and develop practical algorithms for computing a minimum collection of such instances subject to other constraints. We evaluate the efficiency of our approach experimentally, and show through a user study that it is effective in helping users debug relational queries.