Work-Efficient Query Evaluation with PRAMs

From arXiv: this paper, without the appendix, has been accepted for publication in the proceedings of the 26th International Conference on Database Theory (ICDT 2023).

The paper studies query evaluation in parallel constant time in the PRAM model. While it is well known that all relational algebra queries can be evaluated in constant time on an appropriate CRCW-PRAM, this paper is interested in the efficiency of evaluation algorithms, that is, in the number of processors or, asymptotically equivalently, in the work. Naive evaluation in the parallel setting results in huge (polynomial) bounds on the work of such algorithms and in presentations of the result sets that can be extremely scattered in memory. The paper first discusses some obstacles for constant-time PRAM query evaluation. It then presents algorithms for relational operators that are considerably more efficient than the naive approaches. Further, it explores three settings in which efficient sequential query evaluation algorithms exist: acyclic queries, semi-join algebra queries, and join queries, the last in the worst-case optimal framework. Under natural assumptions on the representation of the database, the work of the given algorithms matches the best sequential algorithms in the case of semi-join queries, and it comes close in the other two settings. An important tool is the compaction technique from Hagerup (1992).
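To make the remark about naive evaluation concrete, here is a minimal sequential simulation in Python of one constant-time parallel join step. It is only an illustrative sketch, not the paper's algorithm. The relation schemas R(a, b) and S(b, c), the function name `naive_parallel_join`, and the cell layout `i * len(s) + j` are all assumptions made for this example. One hypothetical processor is assigned to each pair of tuples, so the work is |R|·|S|, and the few result tuples end up scattered across an output array of that size.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, int]
Joined = Tuple[int, int, int]


def naive_parallel_join(r: List[Row], s: List[Row]) -> Tuple[List[Optional[Joined]], int]:
    """Simulate one constant-time step joining R(a, b) with S(b, c).

    Hypothetical processor (i, j) compares r[i] with s[j] and writes any
    joined tuple to its own private cell, so no write conflicts arise.
    The nested loops are a sequential stand-in for the parallel processors.
    """
    out: List[Optional[Joined]] = [None] * (len(r) * len(s))
    processors = 0
    for i, (a, b) in enumerate(r):
        for j, (b2, c) in enumerate(s):
            processors += 1  # one processor per pair of tuples
            if b == b2:
                out[i * len(s) + j] = (a, b, c)
    return out, processors


r = [(1, 10), (2, 20), (3, 30)]
s = [(10, 100), (30, 300), (40, 400)]
cells, work = naive_parallel_join(r, s)
matches = [t for t in cells if t is not None]
```

In this toy run, nine processors produce two result tuples, which sit in cells 0 and 7 of a nine-cell array with empty cells in between. Compaction, the tool the abstract attributes to Hagerup (1992), is aimed at this kind of scattered output; the sketch above does not implement it.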
