TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization

Stream processing engines (SPEs) are widely used for large-scale streaming analytics over unbounded, time-ordered data streams. Modern streaming analytics applications exhibit diverse compute characteristics and impose strict latency and throughput requirements. Over the years, significant attention has gone into building hardware-efficient SPEs that support several query optimization, parallelization, and execution strategies to meet the performance requirements of large-scale streaming analytics applications. In this work, however, we observe that these strategies often fail to generalize to many real-world streaming analytics applications because of several inherent design limitations of current SPEs. We further argue that these limitations stem from shortcomings in the fundamental design choices and the query representation model followed in modern SPEs. To address these challenges, we first propose TiLT, a novel intermediate representation (IR) that offers a highly expressive temporal query language amenable to effective query optimization and parallelization strategies. We then build a compiler backend for TiLT that applies such optimizations to streaming queries and generates hardware-efficient code to achieve high performance for multi-core stream query execution. We demonstrate that TiLT achieves up to 326x (20.49x on average) higher throughput than state-of-the-art SPEs (e.g., Trill) across eight real-world streaming analytics applications. TiLT source code is available at https://github.com/ampersand-projects/tilt.git.
