The emergence of programmable data-plane targets has motivated a new hybrid design for network streaming analytics systems, one that combines these targets' fast packet processing with the rich compute resources available in modern stream processors. However, these systems require careful query planning; that is, specifying in detail how a given set of queries is executed so as to make the best use of the limited resources and programmability offered by data-plane targets. We use one such existing system, Sonata, together with real-world packet traces to understand how the execution of a fixed query workload is affected by the unknown dynamics of the traffic that defines the target's input workload. We observe that static query planning, as employed by Sonata, cannot handle even small changes in the input workload, wasting data-plane resources to the point where query execution is confined mainly to userspace. This paper presents the design and implementation of DynamiQ, a new network streaming analytics platform that employs dynamic query planning to cope with the dynamics of real-world input workloads. Specifically, we develop a suite of practical algorithms for (i) computing effective initial query plans (to start query execution) and (ii) efficiently updating portions of such an initial query plan at runtime (to adapt to changes in the input workload). Using real-world packet traces as the input workload, we show that, compared to Sonata, DynamiQ reduces the stream processor's workload by two orders of magnitude.