The longest common subsequence (LCS) is a fundamental problem in string processing which has numerous algorithmic studies, extensions, and applications. A sequence $u_1, \ldots, u_f$ of $f$ strings s said to be an ($f$-)segmentation of a string $P$ if $P = u_1 \cdots u_f$. Li et al. [BIBM 2022] proposed a new variant of the LCS problem for given strings $T_1, T_2$ and an integer $f$, which we hereby call the segmental LCS problem (SegLCS), of finding (the length of) a longest string $P$ that has an $f$-segmentation which can be embedded into both $T_1$ and $T_2$. Li et al. [IJTCS-FAW 2024] gave a dynamic programming solution that solves SegLCS in $O(fn_1n_2)$ time with $O(fn_1 + n_2)$ space, where $n_1 = |T_1|$, $n_2 = |T_2|$, and $n_1 \le n_2$. Recently, Banerjee et al. [ESA 2024] presented an algorithm which, for a constant $f \geq 3$, solves SegLCS in $\tilde{O}((n_1n_2)^{1-(1/3)^{f-2}})$ time. In this paper, we deal with SegLCS as well as the problem of segmental subsequence pattern matching, SegE, that asks to determine whether a pattern $P$ of length $m$ has an $f$-segmentation that can be embedded into a text $T$ of length $n$. When $f = 1$, this is equivalent to substring matching, and when $f = |P|$, this is equivalent to subsequence matching. Our focus in this article is the case of general values of $f$, and our main contributions are threefold: (1) $O((mn)^{1-\epsilon})$-time conditional lower bound for SegE under the strong exponential-time hypothesis (SETH), for any constant $\epsilon > 0$. (2) $O(mn)$-time algorithm for SegE. (3) $O(fn_2(n_1 - \ell+1))$-time algorithm for SegLCS where $\ell$ is the solution length.
翻译:暂无翻译