Suppose we are given a text $T [1..n]$, a straight-line program with $g$ rules for $T$, and an assignment of tags to the characters in $T$ such that the Burrows-Wheeler Transform of $T$ has $r$ runs, the Burrows-Wheeler Transform of the reverse of $T$ has $\bar{r}$ runs, and the tag array (the list of tags in the lexicographic order of the suffixes starting at the characters the tags are assigned to) has $t$ runs. If the alphabet size is at most polylogarithmic in $n$, then there is an $O (r + \bar{r} + g + t)$-space index for $T$ such that, given a pattern $P [1..m]$, we can compute the maximal exact matches (MEMs) of $P$ with respect to $T$ in $O (m)$ time plus $O (\log n)$ time per MEM, and then list the distinct tags assigned to the first characters of the occurrences of each MEM in constant time per tag listed, all correctly with high probability.
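
To make the quantities above concrete, the following is a minimal brute-force sketch in Python. It is not the index described here: it uses quadratic-time sorting and substring search and none of the compressed structures, and it only illustrates what the BWT runs, the tag array, the MEMs, and the per-MEM tag lists are. All function names (`suffix_array`, `bwt`, `count_runs`, `tag_array`, `mems`, `tags_of_mem`) are hypothetical, and the text is assumed to end with a unique sentinel `$` that is smaller than every other character and does not occur in the pattern.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffixes themselves.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text):
    # BWT: the character preceding each suffix, in suffix-array order.
    # For the suffix starting at 0 this wraps around to the sentinel.
    return "".join(text[i - 1] for i in suffix_array(text))


def count_runs(seq):
    # Number of maximal blocks of equal consecutive symbols.
    return sum(1 for i, x in enumerate(seq) if i == 0 or x != seq[i - 1])


def tag_array(text, tags):
    # tags[i] is the tag of text[i]; list the tags in suffix-array order.
    return [tags[i] for i in suffix_array(text)]


def occurrences(text, s):
    # All (possibly overlapping) start positions of s in text.
    out, k = [], text.find(s)
    while k != -1:
        out.append(k)
        k = text.find(s, k + 1)
    return out


def mems(pattern, text):
    # match[i] = length of the longest prefix of pattern[i:] occurring in text.
    m = len(pattern)
    match = []
    for i in range(m):
        length = 0
        while i + length < m and pattern[i:i + length + 1] in text:
            length += 1
        match.append(length)
    # pattern[i:i+match[i]] is right-maximal by construction; it is also
    # left-maximal iff the match starting at i-1 does not cover it.
    return [(i, match[i]) for i in range(m)
            if match[i] > 0 and (i == 0 or match[i - 1] <= match[i])]


def tags_of_mem(pattern, text, tags, mem):
    # Distinct tags of the first characters of the MEM's occurrences in text.
    start, length = mem
    return sorted({tags[k] for k in occurrences(text, pattern[start:start + length])})


if __name__ == "__main__":
    T = "banana$"
    tags = [0, 0, 0, 1, 1, 1, 1]   # assumed example tagging, one tag per character
    P = "anab"
    print(bwt(T), count_runs(bwt(T)))              # BWT of T and its run count r
    print(tag_array(T, tags), count_runs(tag_array(T, tags)))  # tag array and t
    for mem in mems(P, T):
        print(mem, tags_of_mem(P, T, tags, mem))   # (start, length) and its tags
```

In this toy instance the MEMs are reported as (start, length) pairs in the pattern. The index in the abstract would return the same information, but within the stated space and time bounds rather than by exhaustive search.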