Approximate query processing (AQP) over dynamic databases, i.e., databases subject to insertions and deletions, has applications ranging from high-frequency trading to Internet-of-Things analytics. We present JanusAQP, a new dynamic AQP system that supports SUM, COUNT, AVG, MIN, and MAX queries under insertions and deletions to the dataset. JanusAQP extends static partition tree synopses, which are hierarchical aggregations of datasets, to the dynamic setting. This paper contributes new methods for: (1) efficient initialization of the data synopsis in the presence of incoming data, (2) maintenance of the data synopsis under insertions and deletions, and (3) re-optimization of the partitioning to reduce the approximation error. JanusAQP reduces the error of a state-of-the-art baseline by more than 60% while using only 10% storage cost. In a single-node setting, JanusAQP can process more than 100K updates per second while keeping query latency at the millisecond level.
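To make the idea of a partition tree synopsis concrete, the following is a minimal sketch, not the JanusAQP implementation. It assumes a one-dimensional key domain split by fixed, hypothetical `boundaries`, stores a COUNT and SUM at every tree node, and answers range queries exactly for fully covered partitions while interpolating uniformly inside a partially covered leaf (an assumption made here for brevity). The class and method names are illustrative, and MIN/MAX, initialization under incoming data, and re-partitioning are out of scope.

```python
from bisect import bisect_right


class PartitionTree:
    """Illustrative 1-D partition tree: every node stores COUNT and SUM.

    Leaves are the buckets [b[i], b[i+1]) defined by sorted boundaries.
    Hypothetical sketch; not the data structure from the paper.
    """

    def __init__(self, boundaries):
        self.b = list(boundaries)
        self.k = len(self.b) - 1          # number of leaf partitions
        self.size = 1
        while self.size < self.k:         # pad leaves to a power of two
            self.size *= 2
        self.count = [0] * (2 * self.size)
        self.total = [0.0] * (2 * self.size)

    def _leaf(self, key):
        i = bisect_right(self.b, key) - 1
        if not 0 <= i < self.k:
            raise ValueError("key outside the partitioned domain")
        return i

    def _update(self, key, value, sign):
        # Walk from the leaf to the root, adjusting each ancestor aggregate.
        node = self.size + self._leaf(key)
        while node >= 1:
            self.count[node] += sign
            self.total[node] += sign * value
            node //= 2

    def insert(self, key, value):
        self._update(key, value, +1)

    def delete(self, key, value):
        # Assumes the (key, value) pair was inserted earlier.
        self._update(key, value, -1)

    def _prefix(self, x):
        """Approximate (COUNT, SUM) over all keys strictly below x."""
        if x <= self.b[0]:
            return 0.0, 0.0
        if x >= self.b[-1]:
            return float(self.count[1]), self.total[1]
        i = self._leaf(x)
        c, s = 0.0, 0.0
        lo, hi = self.size, self.size + i  # leaves [0, i) are fully covered
        while lo < hi:
            if lo & 1:
                c += self.count[lo]
                s += self.total[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                c += self.count[hi]
                s += self.total[hi]
            lo //= 2
            hi //= 2
        # Partially covered leaf: uniform interpolation (an assumption).
        frac = (x - self.b[i]) / (self.b[i + 1] - self.b[i])
        leaf = self.size + i
        return c + frac * self.count[leaf], s + frac * self.total[leaf]

    def query(self, lo, hi):
        """Approximate COUNT, SUM, and AVG over keys in [lo, hi)."""
        c_hi, s_hi = self._prefix(hi)
        c_lo, s_lo = self._prefix(lo)
        c, s = c_hi - c_lo, s_hi - s_lo
        return {"count": c, "sum": s, "avg": s / c if c else None}
```

In this sketch each insertion or deletion touches only the nodes on one leaf-to-root path, so the update cost is logarithmic in the number of partitions, and a query whose endpoints fall on partition boundaries is answered exactly from the stored aggregates.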