Let two static sequences of strings $P$ and $S$, representing prefix and suffix conditions respectively, be given as input for preprocessing. For the query, let two positive integers $k_1$ and $k_2$ be given, together with a string $T$ given in an online manner, where $T_i$ denotes the length-$i$ prefix of $T$ for $1 \leq i \leq |T|$. In this paper we are interested in computing the set $\mathit{ans_i}$ of distinct substrings $w$ of $T_i$ such that $k_1 \leq |w| \leq k_2$, and $w$ has some $p \in P$ as a prefix and some $s \in S$ as a suffix. Let $\sigma$ denote the alphabet size. We show that after $O((\Vert P\Vert +\Vert S\Vert)\log\sigma)$-time preprocessing, the counting problem of outputting $|\mathit{ans_i}|$ at each iteration $i$ can be solved in $O(|T_i| \log\sigma)$ cumulative time, while the reporting problem can be solved in $O(|T_i| \log\sigma + |\mathit{ans_i}|)$ cumulative time, with both problems requiring only $O(|T_i|+\Vert P\Vert + \Vert S\Vert)$ working space. The preprocessing time can be reduced to $O(\Vert P\Vert +\Vert S\Vert)$ for integer alphabets of size polynomial in $\Vert P\Vert +\Vert S\Vert$. Here, for a sequence of strings $A$, $\Vert A\Vert=\sum_{u\in A}|u|$. Our algorithms have potential applications in network traffic classification.
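
To make the problem definition concrete, the following is a minimal brute-force sketch in Python that computes $\mathit{ans_i}$ directly from the definition. It is not the algorithm of this paper and comes nowhere near the stated time bounds; it only serves as a reference for what the counting and reporting problems output. The function names `ans` and `online_counts` are hypothetical, and the sketch assumes that the matched prefix $p$ and suffix $s$ are allowed to overlap inside $w$.

```python
def ans(T_i, P, S, k1, k2):
    """Brute-force reference (hypothetical helper, not the paper's method).

    Returns the set of distinct substrings w of T_i with k1 <= |w| <= k2
    that have some p in P as a prefix and some s in S as a suffix.
    Assumption: p and s may overlap inside w.
    """
    result = set()
    n = len(T_i)
    for start in range(n):
        # Only lengths that fit inside T_i and lie within [k1, k2].
        for length in range(k1, min(k2, n - start) + 1):
            w = T_i[start:start + length]
            if any(w.startswith(p) for p in P) and any(w.endswith(s) for s in S):
                result.add(w)
    return result


def online_counts(T, P, S, k1, k2):
    """Counting problem, naively: |ans_i| for every prefix T_i of T."""
    return [len(ans(T[:i], P, S, k1, k2)) for i in range(1, len(T) + 1)]
```

For instance, with $T = \texttt{abab}$, $P = (\texttt{a})$, $S = (\texttt{b})$, $k_1 = 2$ and $k_2 = 4$, the qualifying substrings of $T_4$ are $\texttt{ab}$ and $\texttt{abab}$, so $|\mathit{ans_4}| = 2$.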