Are We Still Missing an Item?

The missing item problem, as introduced by Stoeckl in his work at SODA 23, focuses on continually identifying a missing element $e$ in a stream of elements ${e_1, ..., e_{\ell}}$ from the set $\{1,2,...,n\}$, such that $e \neq e_i$ for any $i \in \{1,...,n\}$. Stoeckl's investigation primarily delves into scenarios with $\ell<n$, providing bounds for the (i) deterministic case, (ii) the static case -- where the algorithm might be randomized but the stream is fixed in advanced) and (iii) the adversarially robust case -- where the algorithm is randomized and each stream element can be chosen depending on earlier algorithm outputs. Building upon this foundation, our paper addresses previously unexplored aspects of the missing item problem. In the first segment, we examine the static setting with a long stream, where the length of the steam $\ell$ is close to or even exceeds the size of the universe $n$. We present an algorithm demonstrating that even when $\ell$ is very close to $n$ (say $\ell=n-1$), polylog($n$) bits of memory suffice to identify the missing item. Additionally, we establish tight bounds of $\tilde{\Theta(k)}$ for the scenario of $\ell = n+k$. The second segment of this part of our work focuses on the {\em adversarially robust setting}. We show a lower bound for a pseudo-deterministic error-zero (where the algorithm reports its errors) algorithm of approximating $\Omega(\ell)$, up to polylog factors. Based on Stoeckl's work, we establish a lower bound for a random-start (only use randomness at initialization) error-zero streaming algorithm. In the final segment, we explore streaming algorithms with randomness-on-the-fly, where the random bits that are saved for future use are included in the space cost. For streams with length $\ell = O(\sqrt{n})$, we provide an upper bound of $O(log n)$. This establishes a gap between randomness-on-the-fly to random-start.

翻译：暂无翻译