In this paper, we propose the first deterministic algorithms to solve the frequency estimation and frequent item problems in the bounded deletion model. We establish the space lower bound for solving the deterministic frequent items problem in the bounded deletion model, and propose the Lazy SpaceSaving$^\pm$ and SpaceSaving$^\pm$ algorithms with optimal space bound. We then develop an efficient implementation of the SpaceSaving$^\pm$ algorithm that minimizes the latency of update operations using novel data structures. The experimental evaluations testify that SpaceSaving$^\pm$ has accurate frequency estimations and achieves very high recall and precision across different data distributions while using minimal space. Our analysis and experiments clearly demonstrate that SpaceSaving$^\pm$ provides more accurate estimations using the same space as the state of the art protocols for applications with up to 93% of items deleted. Moreover, motivated by prior work, we propose Dyadic SpaceSaving$^\pm$, the first deterministic quantile approximation sketch in the bounded deletion model.
翻译:在本文中,我们提出了第一个确定性算法,以解决受约束删除模型中的频率估计和频繁项目问题。我们建立了用于解决受约束删除模型中确定性频繁项目问题的较低空间,并提出了Lazy SpaceSaving$pm$和空间Saving$pm$算法,这些算法具有最佳的空间约束。然后我们开发了一个高效的SpaceSave$pm$的算法,该算法将使用新数据结构进行更新操作的延迟性最小化。实验评估证明,SpaceSaving$pm$具有准确的频率估计,在使用最小空间的情况下,在不同数据分布中实现非常高的回溯和精确度。我们的分析和实验清楚地表明,SpaceSaving$pm$提供比艺术协议状况更准确的估算,用于多达93%的被删除的项目的应用。此外,我们基于先前的工作,我们提议在受约束的删除模型中提出DyadspaceSaving $\pm$,第一个确定性微缩缩图。