Efficient range-summability (ERS) of a long list of random variables is a fundamental algorithmic problem with applications in three important database areas: data stream processing, space-efficient histogram maintenance (SEHM), and approximate nearest neighbor search (ANNS). In this work, we propose a novel dyadic simulation framework and use it to develop three novel ERS solutions: Gaussian-dyadic simulation tree (DST), Cauchy-DST, and Random Walk-DST. We also propose novel rejection sampling techniques to make these solutions computationally efficient. Furthermore, we develop a novel k-wise independence theory that allows our ERS solutions to have both high computational efficiency and strong provable independence guarantees.
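To make the dyadic idea concrete, the following is a minimal sketch of range-summing i.i.d. standard Gaussian variables by walking a dyadic tree. It is an illustration under stated assumptions, not the construction proposed in this work. It relies only on a standard fact: if S is the sum of 2m i.i.d. N(0,1) variables, then conditional on S the sum of the first m is distributed as N(S/2, m/2). The names `_node_normal`, `prefix_sum`, and `range_sum` are hypothetical, and the SHA-256-seeded generator is a stand-in for the hashing scheme; it does not reproduce the k-wise independence guarantees or the rejection sampling techniques mentioned above.

```python
import hashlib
import math
import random


def _node_normal(seed: int, label: str) -> float:
    """Deterministic N(0,1) draw keyed by a tree-node label.

    Assumption: a SHA-256-seeded PRNG stands in for whatever hash family
    a real ERS scheme would use.
    """
    digest = hashlib.sha256(f"{seed}:{label}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return rng.gauss(0.0, 1.0)


def prefix_sum(seed: int, log_n: int, b: int) -> float:
    """Return X_0 + ... + X_{b-1} for n = 2**log_n simulated N(0,1) variables.

    Only O(log n) node draws are made; no individual X_i is materialized.
    """
    n = 1 << log_n
    if not 0 <= b <= n:
        raise ValueError("b must lie in [0, n]")

    # Sum over the whole range [0, n) is N(0, n).
    total = math.sqrt(n) * _node_normal(seed, "root")
    if b == n:
        return total

    acc = 0.0          # sum of X_0 .. X_{lo-1}
    lo, size = 0, n    # current node covers [lo, lo + size)
    node_sum = total   # sum of the variables under the current node
    level, index = log_n, 0

    # Invariant: lo <= b < lo + size.
    while b > lo:
        half = size // 2
        z = _node_normal(seed, f"{level}:{index}")
        # Left-half sum given the node sum: N(node_sum / 2, size / 4).
        left = node_sum / 2.0 + (math.sqrt(size) / 2.0) * z
        if b >= lo + half:
            # The whole left half lies inside the prefix; descend right.
            acc += left
            lo += half
            node_sum = node_sum - left
            index = 2 * index + 1
        else:
            # The prefix ends inside the left half; descend left.
            node_sum = left
            index = 2 * index
        level -= 1
        size = half
    return acc


def range_sum(seed: int, log_n: int, a: int, b: int) -> float:
    """Return X_a + ... + X_{b-1} as a difference of two prefix sums."""
    if a > b:
        raise ValueError("require a <= b")
    return prefix_sum(seed, log_n, b) - prefix_sum(seed, log_n, a)
```

For example, `range_sum(7, 4, 3, 11)` returns the sum of the eight simulated variables X_3 through X_10 out of sixteen. Repeated calls with the same seed agree with each other and with the sum of the corresponding single-element ranges, up to floating-point rounding.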