We study streaming algorithms in the white-box adversarial model, where the stream is chosen adaptively by an adversary who observes the entire internal state of the algorithm at each time step. We show that nontrivial algorithms are still possible. We first give a randomized algorithm for the $L_1$-heavy hitters problem that outperforms the optimal deterministic Misra-Gries algorithm on long streams. If the white-box adversary is computationally bounded, we use cryptographic techniques to reduce the memory of our $L_1$-heavy hitters algorithm even further and to design a number of additional algorithms for graph, string, and linear algebra problems. The existence of such algorithms is surprising, as the streaming algorithm does not even have a secret key in this model, i.e., its state is entirely known to the adversary. One algorithm we design estimates the number of distinct elements in a stream with insertions and deletions, achieving a multiplicative approximation in sublinear space; no deterministic algorithm can achieve this. We also give a general technique that translates any two-player deterministic communication lower bound into a lower bound for {\it randomized} algorithms robust to a white-box adversary. In particular, our results show that for all $p\ge 0$, there exists a constant $C_p>1$ such that any $C_p$-approximation algorithm for $F_p$ moment estimation in insertion-only streams with a white-box adversary requires $\Omega(n)$ space for a universe of size $n$. Similarly, there is a constant $C>1$ such that any $C$-approximation algorithm for matrix rank in an insertion-only stream requires $\Omega(n)$ space with a white-box adversary. Our algorithmic results based on cryptography thus show a separation between computationally bounded and unbounded adversaries. (Abstract shortened to meet arXiv limits.)
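For orientation, the deterministic Misra-Gries baseline named above can be sketched as follows. This is the classical textbook algorithm, not the randomized or cryptographic algorithms of this work; the function name `misra_gries` and the parameter `k` are illustrative choices. With $k-1$ counters on a stream of length $m$, each reported count undercounts the true frequency by at most $m/k$, so every item occurring more than $m/k$ times is retained. Because the algorithm is deterministic, exposing its state to an adversary does not affect this guarantee.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Classical Misra-Gries summary using at most k - 1 counters.

    For a stream of length m, every returned count c for an item with
    true frequency f satisfies f - m/k <= c <= f.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # No free counter: decrement all counters, dropping zeros.
            # The new item itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

For example, `misra_gries([1, 1, 1, 2, 3, 1, 4, 1], k=2)` keeps a single counter and ends up tracking only item `1`, the one element that occurs in more than half of the stream.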