改进了通过 " 巴克丁 " 策略组合和覆盖的滑动窗口的分类和覆盖的窗口比值 (Improved Sliding Window Algorithms for Clustering and Coverage via Bucketing-Based Sketches)

Streaming computation plays an important role in large-scale data analysis. The sliding window model is a model of streaming computation which also captures the recency of the data. In this model, data arrives one item at a time, but only the latest $W$ data items are considered for a particular problem. The goal is to output a good solution at the end of the stream by maintaining a small summary during the stream. In this work, we propose a new algorithmic framework for designing efficient sliding window algorithms via bucketing-based sketches. Based on this new framework, we develop space-efficient sliding window algorithms for $k$-cover, $k$-clustering and diversity maximization problems. For each of the above problems, our algorithm achieves $(1\pm \varepsilon)$-approximation. Compared with the previous work, it improves both the approximation ratio and the space.

翻译：串流计算在大规模数据分析中起着重要作用。滑动窗口模型是一个流式计算模型, 也可以捕捉数据的耐受性。在这个模型中, 数据一次到达一个项目, 但只有最新的美元数据项目才考虑特定问题。目标是在流中保持一个小摘要, 在流中在流的尾端输出一个好的解决办法。在这项工作中, 我们提出一个新的算法框架, 用于通过以桶桶为基础的草图设计高效的滑动窗口算法。基于这个新框架, 我们为 $- 覆盖、 $- 组合和多样性最大化问题开发空间高效滑动窗口算法。对于上述每一个问题, 我们的算法都实现了$( 1\ pm \ varepsilon)- 匹配。与先前的工作相比, 它改进了近似率和空间。