We address the joint optimization of multiple stream joins in a scale-out architecture by tailoring prior work on multi-way stream joins to predicate-driven data partitioning schemes. We present an integer linear programming (ILP) formulation that selects the partitioning and tuple routing with minimal probe load, and we describe how routing and operator placement can be rewired dynamically in response to changing data characteristics and to the arrival or expiration of queries. The presented algorithms and optimization schemes are implemented in CLASH, a data stream processor developed in our group that, after optimization, translates queries into deployable Apache Storm topologies. Experiments on real-world data demonstrate the potential of multi-query optimization of multi-way stream joins and the effectiveness and feasibility of the ILP-based optimization.
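To make the notion of probe load concrete, the following is a minimal sketch of choosing a partitioning attribute per stream store so that the total probe load over several join queries is smallest. It is not the ILP formulation of the paper: it enumerates all assignments exhaustively as a stand-in for a solver, and the cost model is an assumption made for illustration (a probe reaches one partition when the target store is partitioned on the join attribute, and is broadcast to all partitions otherwise). The function name `cheapest_partitioning`, the streams `R`, `S`, `T`, and the rates are all hypothetical.

```python
from itertools import product


def cheapest_partitioning(candidates, probes, rates, num_partitions):
    """Pick one partitioning attribute per store by exhaustive search.

    candidates:     store name -> attributes it may be partitioned on
    probes:         list of (source stream, target store, join attribute)
    rates:          stream name -> arrival rate in tuples per second
    num_partitions: number of partitions of every store (assumed uniform)
    """
    stores = sorted(candidates)
    best_cost, best_choice = None, None
    for combo in product(*(candidates[s] for s in stores)):
        choice = dict(zip(stores, combo))
        cost = 0
        for source, target, attr in probes:
            # Assumed cost model: a matching partitioning attribute lets the
            # probe go to one partition; otherwise it is broadcast to all.
            fanout = 1 if choice[target] == attr else num_partitions
            cost += rates[source] * fanout
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


# Two hypothetical queries sharing stream S: R join S on a, S join T on b.
candidates = {"R": ["a"], "S": ["a", "b"], "T": ["b"]}
probes = [("R", "S", "a"), ("S", "R", "a"), ("S", "T", "b"), ("T", "S", "b")]
rates = {"R": 100, "S": 10, "T": 1}

cost, choice = cheapest_partitioning(candidates, probes, rates, num_partitions=4)
print(cost, choice)
```

In this toy instance the shared store S is partitioned on `a`, because the high-rate stream R probes it on that attribute and broadcasting the rare T tuples is cheaper. An ILP encoding would express the same choice with binary decision variables, which scales to cases where enumeration does not.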