Threshold queries are an important class of queries that only require computing or counting answers up to a specified threshold value. To the best of our knowledge, threshold queries have been largely disregarded in the research literature, which is surprising given how common they are in practice. In this paper, we present a deep theoretical analysis of threshold query evaluation and show that thresholds can be used to significantly improve the asymptotic bounds of state-of-the-art query evaluation algorithms. We also show empirically that threshold queries matter in practice. Contrary to conventional wisdom, we found important scenarios in real-world data sets in which users are interested in computing the results of queries up to a certain threshold, independent of any ranking function that orders the query results by importance.
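To make the notion concrete, the following minimal Python sketch illustrates what "computing or counting answers up to a threshold" can mean operationally. It is not the evaluation algorithm analyzed in the paper: it only shows the basic semantics, using a lazy nested-loop join as a stand-in query. All names (`answers_up_to`, `count_up_to`, `has_at_least`, `join_pairs`) and the toy data are assumptions introduced for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def answers_up_to(answers: Iterable[T], threshold: int) -> List[T]:
    """Return at most `threshold` answers, pulling no further items from the source."""
    return list(islice(answers, threshold))


def count_up_to(answers: Iterable[T], threshold: int) -> int:
    """Count answers, but never look past the first `threshold` of them."""
    return sum(1 for _ in islice(answers, threshold))


def has_at_least(answers: Iterable[T], threshold: int) -> bool:
    """Decide whether the query has at least `threshold` answers."""
    return count_up_to(answers, threshold) >= threshold


def join_pairs(
    left: List[Tuple[int, str]],
    right: List[Tuple[int, str]],
    probes: List[int],
) -> Iterator[Tuple[str, str]]:
    """Hypothetical stand-in query: a lazy nested-loop equi-join.

    probes[0] is incremented once per key comparison so that the work
    actually performed can be inspected afterwards.
    """
    for left_key, left_value in left:
        for right_key, right_value in right:
            probes[0] += 1
            if left_key == right_key:
                yield (left_value, right_value)


# Toy data (illustrative only): every left row matches exactly one right row.
left_rows = [(i % 10, f"a{i}") for i in range(100)]
right_rows = [(j, f"b{j}") for j in range(10)]

bounded_probes = [0]
bounded_count = count_up_to(join_pairs(left_rows, right_rows, bounded_probes), 5)

full_probes = [0]
full_count = count_up_to(join_pairs(left_rows, right_rows, full_probes), 10_000)
```

Because the join is a generator, the bounded call stops comparing keys as soon as the fifth answer is produced, whereas the effectively unbounded call enumerates every answer. This early termination is only the simplest way a threshold can save work; the asymptotic improvements claimed above concern more refined evaluation strategies that the sketch does not attempt to reproduce.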