Efficient Enumeration Algorithms for Annotated Grammars

From arXiv, 54 pages. Full version, with proofs, of the article to appear at PODS'22. Except for formatting and minor differences, this article contains all the contents of the PODS'22 article, plus the technical appendices.

We introduce annotated grammars, an extension of context-free grammars which allows annotations on terminals. Our model extends the standard notion of regular spanners, and is more expressive than the extraction grammars recently introduced by Peterfreund. We study the enumeration problem for annotated grammars: fixing a grammar, and given a string as input, enumerate all annotations of the string that form a word derivable from the grammar. Our first result is an algorithm for unambiguous annotated grammars, which preprocesses the input string in cubic time and enumerates all annotations with output-linear delay. This improves over Peterfreund's result, which needs quintic time preprocessing to achieve this delay bound. We then study how we can reduce the preprocessing time while keeping the same delay bound, by making additional assumptions on the grammar. Specifically, we present a class of grammars which only have one derivation shape for all outputs, for which we can enumerate with quadratic time preprocessing. We also give classes that generalize regular spanners for which linear time preprocessing suffices.
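To make the enumeration problem concrete, here is a minimal brute-force sketch in Python. It is not the algorithm of the article: it materializes every output at once, which is exactly what a preprocessing phase followed by output-linear-delay enumeration is meant to avoid. The grammar encoding (the `BINARY` and `TERMINAL` tables, the Chomsky-normal-form-like rule shapes, and the tag `"x"`) is an assumption made for illustration only. An output is represented as a set of `(position, tag)` pairs.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption of this sketch, not the paper's
# formalism), restricted to two rule shapes:
#   binary rules:   X -> Y Z
#   terminal rules: X -> a        (unannotated, tag is None)
#                   X -> (a, tag) (annotated terminal)
# It accepts words of length >= 2 over {a, b} and may tag any 'a' with "x".
BINARY = {
    "S": [("A", "S"), ("A", "A")],
}
TERMINAL = {
    "A": [("a", None), ("a", "x"), ("b", None)],
}


def enumerate_annotations(word, start="S"):
    """Return all annotations of `word` derivable from `start`.

    Each annotation is a frozenset of (position, tag) pairs.
    Naive CYK-style computation over spans; the result sets themselves
    can be exponentially large in len(word).
    """

    @lru_cache(maxsize=None)
    def outputs(symbol, i, j):
        # Annotations with which `symbol` derives word[i:j].
        results = set()
        if j - i == 1:
            for char, tag in TERMINAL.get(symbol, []):
                if char == word[i]:
                    if tag is None:
                        results.add(frozenset())
                    else:
                        results.add(frozenset({(i, tag)}))
        for left, right in BINARY.get(symbol, []):
            # Try every split point of the span [i, j).
            for k in range(i + 1, j):
                for a in outputs(left, i, k):
                    for b in outputs(right, k, j):
                        results.add(a | b)
        return frozenset(results)

    return outputs(start, 0, len(word))


if __name__ == "__main__":
    for annotation in sorted(enumerate_annotations("aba"), key=sorted):
        print(sorted(annotation))
```

On the word `aba`, this toy grammar yields four annotations, one for each subset of the two `a` positions. Because every annotation here has exactly one derivation, the grammar is unambiguous in the sense relevant to the first result described above.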
