Consider a digraph $G = (V, E)$ where each vertex is unlabeled, each edge is labeled by a character in some alphabet $\Omega$, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of $G$ into a weakly connected digraph $G' = (V', E')$ that enables solving the decision problem of whether there is a walk in $G$ matching an arbitrarily long query string $q$ in time linear in $|q|$ and independent of $|E|$ and $|V|$. We show $G$ can be recovered from $G'$ when for every $v_\ell \in V$, there is some distinct string $s_\ell$ on $\Omega$ such that $v_\ell$ is the origin of a closed walk in $G$ matching $s_\ell$, and no other walk in $G$ matches $s_\ell$ unless it starts and ends at $v_\ell$. We then exploit this invertibility condition to strategically alter any $G$ so its transform $G'$ enables retrieval of all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ in $O(|q| + t \log |V|)$ time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.
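As a rough illustration of the decision problem described above, the following Python sketch applies the textbook powerset (subset) construction to an edge-labeled digraph and then tests whether some walk matches a query string. It is a minimal sketch under stated assumptions, not the construction analyzed in this work: the names `powerset_transform` and `has_walk`, the edge encoding as `(tail, label, head)` triples, and the choice of the full vertex set as the start state (so that a walk may begin anywhere) are all illustrative. The invertibility condition and the terminal-vertex retrieval are not modeled.

```python
from collections import defaultdict, deque


def powerset_transform(vertices, edges):
    """Subset construction over an edge-labeled digraph.

    `edges` is an iterable of (tail, label, head) triples (an assumed encoding).
    Returns a dict mapping (state_id, label) -> state_id, where each state
    stands for a non-empty subset of `vertices` and state 0 is the full set.
    """
    # out[u][c] = set of heads reachable from u along an edge labeled c
    out = defaultdict(lambda: defaultdict(set))
    for u, c, v in edges:
        out[u][c].add(v)

    start = frozenset(vertices)
    ids = {start: 0}          # subset -> integer id
    delta = {}                # (id, label) -> id
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        # Group the successors of every vertex in the subset by label.
        by_label = defaultdict(set)
        for u in subset:
            for c, heads in out[u].items():
                by_label[c] |= heads
        for c, heads in by_label.items():
            target = frozenset(heads)
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            delta[(ids[subset], c)] = ids[target]
    return delta


def has_walk(delta, q):
    """Return True if some walk in the original digraph matches q.

    Performs one dictionary lookup per character of q.
    """
    state = 0  # the full vertex set: a walk may start at any vertex
    for c in q:
        state = delta.get((state, c))
        if state is None:  # no vertex in the current subset has an edge labeled c
            return False
    return True


# Hypothetical example graph: 0 -a-> 1, 1 -b-> 2, 2 -a-> 0, 1 -a-> 1
example_edges = [(0, "a", 1), (1, "b", 2), (2, "a", 0), (1, "a", 1)]
example_delta = powerset_transform({0, 1, 2}, example_edges)
```

Once the table is built, each query costs one lookup per character, which is what makes the decision step independent of $|V|$ and $|E|$. The number of reachable subsets can grow exponentially in $|V|$ in the worst case, so this sketch is only practical for small graphs.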