In the $d$-dimensional turnstile streaming model, a frequency vector $\mathbf{x}=(\mathbf{x}(1),\ldots,\mathbf{x}(n))\in (\mathbb{R}^d)^n$ is updated entry-wise over a stream. We consider the problem of \emph{$f$-moment estimation}, in which one wants to estimate $$f(\mathbf{x})=\sum_{v\in[n]}f(\mathbf{x}(v))$$ with a small-space sketch. In this work we present a simple and generic scheme for constructing sketches, based on the novel idea of hashing indices to \emph{L\'evy processes}. From such a sketch one can estimate the $f$-moment $f(\mathbf{x})$, where $f$ is the \emph{characteristic exponent} of the L\'evy process. The fundamental \emph{L\'evy-Khintchine representation theorem} completely characterizes the space of all possible characteristic exponents, which in turn characterizes the set of $f$-moments that can be estimated by this generic scheme. The new scheme has strong explanatory power: it unifies the construction of many existing sketches ($F_0$, $L_0$, $L_2$, $L_\alpha$, $L_{p,q}$, etc.) and implies the tractability of many nearly periodic functions that were previously unclassified. Furthermore, the scheme generalizes directly to multidimensional cases ($d\geq 2$) by considering multidimensional L\'evy processes, and it can be further generalized to estimate \emph{heterogeneous moments} by projecting different indices with different L\'evy processes. We conjecture that the set of tractable functions can be characterized using the L\'evy-Khintchine representation theorem via what we call the \emph{Fourier-Hahn-L\'evy} method.
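
To see concretely why the characteristic exponent is the quantity that can be estimated, here is a minimal worked sketch. It rests on assumptions that are ours for illustration and are not taken from the text above: the sign convention $\mathbb{E}\,e^{i\langle \mathbf{z},X_t\rangle}=e^{-tf(\mathbf{z})}$ for a L\'evy process $(X_t)_{t\geq 0}$ on $\mathbb{R}^d$, fully independent copies $X^{(v)}$ assigned to the indices $v\in[n]$, and a single hypothetical register $C$. Set $$C=\sum_{v\in[n]}\langle \mathbf{x}(v),X^{(v)}_1\rangle,$$ which is linear in $\mathbf{x}$, so a turnstile update $\mathbf{x}(v)\leftarrow\mathbf{x}(v)+\Delta$ can be handled by $C\leftarrow C+\langle \Delta,X^{(v)}_1\rangle$. By independence across indices, $$\mathbb{E}\,e^{iC}=\prod_{v\in[n]}\mathbb{E}\,e^{i\langle \mathbf{x}(v),X^{(v)}_1\rangle}=\prod_{v\in[n]}e^{-f(\mathbf{x}(v))}=e^{-f(\mathbf{x})}.$$ For instance, with $d=1$, standard Brownian motion has $f(z)=z^2/2$, so $f(\mathbf{x})=\frac{1}{2}\sum_{v\in[n]}\mathbf{x}(v)^2$ is half the squared $L_2$ norm, while a symmetric $\alpha$-stable process with $f(z)=|z|^\alpha$ gives the $L_\alpha$ moment $\sum_{v\in[n]}|\mathbf{x}(v)|^\alpha$. Averaging $e^{iC}$ over several independent registers would then estimate $e^{-f(\mathbf{x})}$; the number of registers, the time parameter, and the precise estimator are left unspecified in this illustration.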