We study the complexity of constructing an optimal parsing $\varphi$ of a string ${\bf s} = s_1 \dots s_n$ under the constraint that, given a position $p$ in the original text and the LZ76-like (Lempel-Ziv 76) encoding of ${\bf s}$ based on $\varphi$, it is possible to identify (decompress) the character $s_p$ by performing at most $c$ accesses to the LZ encoding, for a given integer $c$. We refer to such a parsing $\varphi$ as a $c$-bounded access LZ parsing, or $c$-BLZ parsing, of ${\bf s}$. We show that, for any constant $c$, the problem of computing the optimal $c$-BLZ parsing of a string, i.e., the one with the minimum number of phrases, is NP-hard and also APX-hard, i.e., no PTAS can exist under the standard complexity assumption $P \neq NP$. We also study the ratio between the sizes of an optimal $c$-BLZ parsing of a string ${\bf s}$ and an optimal LZ76 parsing of ${\bf s}$ (which can be greedily computed in polynomial time).
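To make the access constraint concrete, the following is a minimal sketch under a simplified model that is an assumption of this example, not the paper's formal definition. A phrase is either a literal character or a copy `(source_start, length)` from earlier text, and the access count of a position is the number of phrase lookups followed until a literal is reached. The names `access_counts` and `is_c_bounded` are hypothetical.

```python
from typing import List, Tuple, Union

# Assumed phrase model (illustrative only): a literal character,
# or a copy (source_start, length) referring to earlier text.
Phrase = Union[str, Tuple[int, int]]


def access_counts(phrases: List[Phrase]) -> Tuple[str, List[int]]:
    """Decode the phrases and return (text, hops).

    hops[p] is the number of phrase lookups needed to resolve position p
    under the assumed model: 1 for a literal, and 1 + hops[source] for a
    copied character.
    """
    text: List[str] = []
    hops: List[int] = []
    for phrase in phrases:
        if isinstance(phrase, str):
            text.append(phrase)
            hops.append(1)
        else:
            src, length = phrase
            # Copy character by character so that self-overlapping
            # copies (source running into the phrase itself) work.
            for k in range(length):
                text.append(text[src + k])
                hops.append(hops[src + k] + 1)
    return "".join(text), hops


def is_c_bounded(phrases: List[Phrase], c: int) -> bool:
    """Check that every position is resolved within at most c lookups."""
    _, hops = access_counts(phrases)
    return max(hops, default=0) <= c


example: List[Phrase] = ["a", "b", (0, 2), (2, 3)]
decoded, depth = access_counts(example)
```

In this sketch, chaining copies on top of copies increases the access count, which is why bounding $c$ can force a parsing to use more phrases than an unconstrained one.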