We study the sketching and communication complexity of deciding whether a binary sequence $x$ of length $n$ contains a binary sequence $y$ of length $k$ as a subsequence. We prove that this problem has a deterministic sketch of size $O(k \log k)$ and that any sketch has size $\Omega(k)$. We also give nearly tight bounds for the communication complexity of this problem, and extend most of our results to larger alphabets. Finally, we leverage ideas from our sketching lower bound to prove a lower bound for the VC dimension of a family of classifiers that are based on subsequence containment.
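For concreteness, the underlying decision problem can be sketched as a greedy left-to-right scan. This is only a minimal illustration of subsequence containment under the assumption that sequences are given as Python strings over {0, 1}; the function name `is_subsequence` is hypothetical, and the scan reads all of $x$, so it is not the $O(k \log k)$ sketch from the paper.

```python
def is_subsequence(y: str, x: str) -> bool:
    """Return True if y can be obtained from x by deleting characters.

    Hypothetical helper for illustration only: a plain linear scan over x,
    not a sketching or communication protocol.
    """
    it = iter(x)
    # `ch in it` consumes the iterator up to and including the first match,
    # so each symbol of y is matched strictly after the previous one.
    return all(ch in it for ch in y)


if __name__ == "__main__":
    print(is_subsequence("101", "11001"))
    print(is_subsequence("000", "11001"))
```

The greedy choice (always matching each symbol of $y$ at its earliest possible position in $x$) is safe because taking an earlier match never rules out a later one.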