Absent Subsequences in Words

An absent factor of a string $w$ is a string $u$ which does not occur as a contiguous substring (a.k.a. factor) inside $w$. We extend this well-studied notion and define absent subsequences: a string $u$ is an absent subsequence of a string $w$ if $u$ does not occur as a subsequence (a.k.a. scattered factor) inside $w$. Of particular interest to us are minimal absent subsequences, i.e., absent subsequences all of whose proper subsequences do occur in $w$, and shortest absent subsequences, i.e., absent subsequences of minimal length. We show a series of combinatorial and algorithmic results regarding these two notions. For instance: we give combinatorial characterisations of the sets of minimal and, respectively, shortest absent subsequences in a word, as well as compact representations of these sets; we show how to test efficiently whether a string is a shortest or minimal absent subsequence in a word, and we give efficient algorithms computing the lexicographically smallest absent subsequence of each kind; finally, we show how a data structure for answering shortest-absent-subsequence queries for the factors of a given string can be computed efficiently.
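
To make the three definitions concrete, here is a minimal Python sketch. It is not the paper's algorithm: the function names are hypothetical, the minimality test is a naive check that deletes one letter at a time, and the alphabet is assumed to be passed in explicitly. The length of a shortest absent subsequence is computed with a greedy left-to-right scan that counts maximal blocks ("arches") containing every letter of the alphabet; the standard observation is that this length equals the number of arches plus one.

```python
def is_subsequence(u, w):
    """Return True if u occurs in w as a (scattered) subsequence."""
    it = iter(w)
    # `ch in it` advances the iterator until ch is found, so letters
    # of u are matched left to right without reusing positions of w.
    return all(ch in it for ch in u)


def is_absent(u, w):
    """u is an absent subsequence of w if it does not occur in w."""
    return not is_subsequence(u, w)


def is_minimal_absent(u, w):
    """Naive check (assumption: simplicity over efficiency).

    u is minimal absent if u is absent and every string obtained by
    deleting a single letter of u occurs in w. Single deletions are
    enough because every proper subsequence of u is a subsequence of
    one of those strings.
    """
    if is_subsequence(u, w):
        return False
    return all(is_subsequence(u[:i] + u[i + 1:], w) for i in range(len(u)))


def shortest_absent_length(w, alphabet):
    """Length of a shortest absent subsequence of w over `alphabet`.

    Greedily cut w into blocks that each contain every letter of the
    alphabet; the answer is the number of complete blocks plus one.
    """
    alphabet = set(alphabet)
    arches = 0
    seen = set()
    for ch in w:
        if ch in alphabet:
            seen.add(ch)
        if len(seen) == len(alphabet):
            arches += 1
            seen = set()
    return arches + 1


def is_shortest_absent(u, w, alphabet):
    """u is a shortest absent subsequence if it is absent and has minimal length."""
    return len(u) == shortest_absent_length(w, alphabet) and is_absent(u, w)
```

For example, with `w = "abcab"` over the alphabet `{a, b, c}`, the string `cc` is both a shortest and a minimal absent subsequence, while `aaa` is minimal but not shortest, and `ccc` is absent but not minimal because its subsequence `cc` is already absent.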
