Recent work measures how much offset-value coding speeds up database query operations. Offset-value coding speeds up not only sorting but also duplicate removal, grouping (aggregation) in sorted streams, order-preserving exchange (shuffle), and merge join. It already saves thousands of CPUs in Google's Napa and F1 Query systems, e.g., in grouping algorithms and in log-structured merge-forests. To achieve the full benefit of interesting orderings, however, query execution algorithms must not only consume and exploit offset-value codes but also provide them to the next operation in the pipeline. This short paper describes in detail how order-preserving algorithms (from filter to merge join and even shuffle) can compute offset-value codes for their outputs. These calculations are surprisingly simple and very efficient.
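To illustrate the kind of calculation involved, the following is a minimal sketch (not taken from the paper) of how a filter over a sorted stream might produce offset-value codes for its output. It assumes one common ascending encoding, code = (arity - offset) * domain + value, where offset is the first column in which a row differs from its predecessor and value is the row's value in that column. It also assumes non-negative integer column values below a fixed `DOMAIN`, and a first row coded as if it differed from an imaginary predecessor in column 0. Under these assumptions, the code of a surviving row relative to the previous surviving row is the maximum of its own input code and the codes of the rows dropped in between. The names `ovc`, `encode`, `filter_with_ovc`, and `DOMAIN` are hypothetical and chosen only for this example.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Row = Tuple[int, ...]
Coded = Tuple[int, Row]

DOMAIN = 100  # assumption: every column value lies in range(DOMAIN)


def ovc(base: Row, row: Row, domain: int = DOMAIN) -> int:
    """Ascending offset-value code of `row` relative to its predecessor `base`."""
    arity = len(row)
    for offset, (b, r) in enumerate(zip(base, row)):
        if b != r:
            # A lower offset means a larger difference, hence a larger code.
            return (arity - offset) * domain + r
    return 0  # exact duplicate of the predecessor


def encode(sorted_rows: Iterable[Row], domain: int = DOMAIN) -> Iterator[Coded]:
    """Attach a code to each row of an ascending stream of non-empty rows."""
    prev: Optional[Row] = None
    for row in sorted_rows:
        if prev is None:
            # Assumption: the first row differs from an imaginary
            # predecessor in column 0.
            code = len(row) * domain + row[0]
        else:
            code = ovc(prev, row, domain)
        yield code, row
        prev = row


def filter_with_ovc(
    coded_rows: Iterable[Coded], predicate: Callable[[Row], bool]
) -> Iterator[Coded]:
    """Drop rows failing `predicate` and fix up codes without comparing columns."""
    carried = 0  # largest code seen since the last surviving row
    for code, row in coded_rows:
        carried = max(carried, code)
        if predicate(row):
            yield carried, row
            carried = 0


# Small demonstration on a sorted stream with four columns.
rows = [
    (5, 4, 7, 1),
    (5, 4, 7, 6),
    (5, 4, 8, 2),
    (5, 6, 3, 2),
    (5, 6, 3, 2),
    (7, 1, 1, 1),
]


def last_column_even(row: Row) -> bool:
    return row[-1] % 2 == 0


filtered = list(filter_with_ovc(encode(rows), last_column_even))
```

In this sketch the filter never inspects column values to repair a code. It needs only one integer comparison per input row, which fits the abstract's claim that such calculations are simple and efficient. Re-encoding the surviving rows from scratch with `encode` yields the same codes, which serves as a sanity check on the max rule for this encoding.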