In grammar-based compression, a string is represented by a context-free grammar, also called a straight-line program (SLP), that generates only that string. We refine a recent balancing result (FOCS 2019) stating that one can transform an SLP of size $g$ in linear time into an equivalent SLP of size $O(g)$ so that the height of the unique derivation tree is $O(\log N)$, where $N$ is the length of the represented string. We introduce a new class of balanced SLPs, called contracting SLPs, where for every rule $A \to \beta_1 \dots \beta_k$ the string length of every variable $\beta_i$ on the right-hand side is smaller by a constant factor than the string length of $A$. In particular, every subtree of the derivation tree of a contracting SLP has height logarithmic in its number of leaves. We show that a given SLP of size $g$ can be transformed in linear time into an equivalent contracting SLP of size $O(g)$ with rules of constant length. We present an application to the navigation problem in compressed unranked trees, represented by forest straight-line programs (FSLPs): we extend a linear-space data structure by Reh and Sieber (2020) with the operation of moving to the $i$-th child in time $O(\log d)$, where $d$ is the degree of the current node. Contracting SLPs are also applied to the finger search problem over SLP-compressed strings, where one wants to access positions near a pre-specified finger position, ideally in $O(\log d)$ time, where $d$ is the distance between the accessed position and the finger. We give a linear-space solution in which one can access symbols or move the finger in time $O(\log d + \log^{(t)} N)$ for any constant $t$, where $\log^{(t)} N$ is the $t$-fold logarithm of $N$. This improves a previous solution by Bille, Christiansen, Cording, and Gørtz (2018) with access/move time $O(\log d + \log \log N)$.
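To make the contracting condition concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes an SLP is given as a dictionary mapping each variable to the list of symbols on its right-hand side, treats every symbol without a rule as a terminal of length 1, and uses the factor 1/2 as an illustrative choice for the "constant factor" (the abstract does not fix the constant). The names `expansion_lengths` and `is_contracting` are hypothetical.

```python
def expansion_lengths(rules):
    """Return {variable: length of the string it derives}.

    `rules` maps each variable to the list of symbols on its right-hand
    side. Symbols that have no rule are treated as terminals of length 1.
    Plain recursion is used, so this is only suitable for small examples.
    """
    lengths = {}

    def length(symbol):
        if symbol not in rules:
            return 1  # terminal symbol
        if symbol not in lengths:
            lengths[symbol] = sum(length(s) for s in rules[symbol])
        return lengths[symbol]

    for variable in rules:
        length(variable)
    return lengths


def is_contracting(rules, factor=0.5):
    """Check that every variable B on the right-hand side of a rule for A
    satisfies |B| <= factor * |A|. The default factor 1/2 is an assumption
    made for illustration only."""
    lengths = expansion_lengths(rules)
    for variable, rhs in rules.items():
        for symbol in rhs:
            if symbol in rules and lengths[symbol] > factor * lengths[variable]:
                return False
    return True


# A balanced SLP for "abababab": every variable halves in length.
balanced = {"S": ["X", "X"], "X": ["Y", "Y"], "Y": ["a", "b"]}

# A caterpillar-shaped SLP for "aaaa": T derives 3 of the 4 symbols of S.
caterpillar = {"S": ["T", "a"], "T": ["U", "a"], "U": ["a", "a"]}

print(is_contracting(balanced))     # True
print(is_contracting(caterpillar))  # False
```

In the balanced example each variable on a right-hand side derives at most half of what its parent derives, so the height of every subtree is logarithmic in its number of leaves; the caterpillar example violates the condition already at the root.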