Complex Event Recognition (CER) systems are a prominent technology for finding user-defined query patterns over large data streams in real time. CER query evaluation is known to be computationally challenging, since it requires maintaining a set of partial matches, and this set quickly grows super-linearly in the number of processed events. We present CORE, a novel COmplex event Recognition Engine that focuses on the efficient evaluation of a large class of complex event queries, including time windows as well as the partition-by event correlation operator. This engine uses a novel evaluation algorithm that circumvents the super-linear partial match problem: under data complexity, it takes constant time per input event to maintain a data structure that compactly represents the set of partial matches and, once a match is found, the query results may be enumerated from the data structure with output-linear delay. We experimentally compare CORE against three state-of-the-art CER systems on both synthetic and real-world data. We show that (1) CORE's performance is not affected by the length of the stream, size of the query, or size of the time window, and (2) CORE outperforms the other systems by up to three orders of magnitude on different query workloads.
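To make the idea of a compact representation of partial matches concrete, the following is a minimal sketch and not CORE's actual algorithm. It assumes a single fixed pattern ("an A, later a B, later a C", with arbitrary events allowed in between) and ignores time windows and partition-by. The function name `stream_matches` and the tuple-based linked lists are illustrative choices that do not come from the paper. Each incoming event triggers a constant amount of bookkeeping, and matches are enumerated only when a C arrives, by walking shared list nodes rather than a materialized set of partial matches.

```python
def stream_matches(events):
    """Yield (i, j, k) with i < j < k, events[i] == 'A', events[j] == 'B', events[k] == 'C'.

    Illustrative only: partial matches are never materialized. They are
    encoded by two persistent linked lists built from immutable tuples,
    where each list is either None (empty) or a pair (value, rest).
    """
    a_list = None   # positions of every A seen so far, newest first
    ab_list = None  # nodes (b_pos, snapshot of a_list when that B arrived)

    for k, ev in enumerate(events):
        if ev == "A":
            # O(1): prepend the new position; older snapshots stay valid.
            a_list = (k, a_list)
        elif ev == "B" and a_list is not None:
            # O(1): one node stands for all pairs (a, k) with a in a_list.
            ab_list = ((k, a_list), ab_list)
        elif ev == "C":
            # Enumerate matches by walking the shared structure. Every
            # stored snapshot is non-empty, so each step yields an output.
            node = ab_list
            while node is not None:
                (j, a_node), node = node
                while a_node is not None:
                    i, a_node = a_node
                    yield (i, j, k)


if __name__ == "__main__":
    for match in stream_matches("AABBC"):
        print(match)
```

On a stream with n A events followed by n B events, a naive evaluator would store n² partial (A, B) pairs, whereas this sketch stores 2n list nodes; the quadratic work is paid only during enumeration, in proportion to the number of matches actually output.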