Ranking (or top-k) queries and skyline queries are the most popular approaches for extracting interesting data from large datasets. Ranking queries rely on a scoring function to evaluate and rank tuples. They are fast to compute, but the result is sensitive to the choice of scoring function. Skyline queries rely on the notion of dominance, and their result is the set of all non-dominated tuples. This approach needs no scoring function, but it offers no control over the cardinality of the output. Recent research has proposed further techniques to compensate for these drawbacks. In particular, this paper focuses on the flexible skyline approach.
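To make the contrast concrete, the following is a minimal sketch of both query types on a toy dataset. It is an illustration under stated assumptions, not the method of this paper: the names `top_k`, `dominates`, `skyline`, and `weighted_sum`, the `hotels` data, and the convention that lower attribute values are better are all hypothetical choices made for the example. The flexible skyline itself is not implemented here.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def top_k(points: Sequence[Point], score: Callable[[Point], float], k: int) -> List[Point]:
    """Return the k tuples with the lowest score (assumption: lower is better)."""
    return sorted(points, key=score)[:k]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse on every attribute and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive quadratic skyline: keep every tuple that no other tuple dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum(weights: Sequence[float]) -> Callable[[Point], float]:
    """Build a linear scoring function from the given (hypothetical) weights."""
    return lambda p: sum(w * x for w, x in zip(weights, p))


# Hypothetical data: (price, distance to the city centre), lower is better.
hotels: List[Point] = [(50, 8), (60, 3), (80, 1), (90, 4), (70, 9)]

# Two different weightings give two different top-2 answers.
balanced = top_k(hotels, weighted_sum((0.5, 0.5)), 2)
distance_heavy = top_k(hotels, weighted_sum((0.1, 0.9)), 2)

# The skyline needs no weights, but its size is fixed by the data alone.
non_dominated = skyline(hotels)
```

In this sketch the two weightings return different top-2 sets, which shows the sensitivity to the scoring function, while the skyline returns three tuples and gives the caller no way to ask for more or fewer.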