We consider approximate circular pattern matching (CPM, in short) under the Hamming and edit distance, in which we are given a length-$n$ text $T$, a length-$m$ pattern $P$, and a threshold $k>0$, and we are to report all starting positions of fragments of $T$ (called occurrences) that are at distance at most $k$ from some cyclic rotation of $P$. In the decision version of the problem, we are to check if any such occurrence exists. All previous results for approximate CPM were either average-case upper bounds or heuristics, except for the work of Charalampopoulos et al. [CKP$^+$, JCSS'21], who considered only the Hamming distance. For the reporting version of the approximate CPM problem, under the Hamming distance we improve upon the main algorithm of [CKP$^+$, JCSS'21] from ${\cal O}(n+(n/m)\cdot k^4)$ to ${\cal O}(n+(n/m)\cdot k^3\log\log k)$ time; for the edit distance, we give an ${\cal O}(nk^2)$-time algorithm. We also consider the decision version of the approximate CPM problem. Under the Hamming distance, we obtain an ${\cal O}(n+(n/m)\cdot k^2\log k/\log\log k)$-time algorithm, which nearly matches the algorithm by Chan et al. [CGKKP, STOC'20] for the standard counterpart of the problem. Under the edit distance, the ${\cal O}(nk\log^3 k)$ runtime of our algorithm nearly matches the ${\cal O}(nk)$ runtime of the Landau-Vishkin algorithm [LV, J. Algorithms'89]. As a stepping stone, we propose ${\cal O}(nk\log^3 k)$-time algorithm for the Longest Prefix $k'$-Approximate Match problem, proposed by Landau et al. [LMS, SICOMP'98], for all $k'\in \{1,\dots,k\}$. We give a conditional lower bound that suggests a polynomial separation between approximate CPM under the Hamming distance over the binary alphabet and its non-circular counterpart. We also show that a strongly subquadratic-time algorithm for the decision version of approximate CPM under edit distance would refute SETH.
翻译:我们考虑在 Hamming (CPM, 简称) 和编辑距离下大致的圆形匹配 (CPM, 缩略) (CPM, 缩略) 。 大约的圆形匹配( 缩略) (CPM, 缩略) (KPM, 缩略) (KP, 缩略) (kk3, 缩略) 所有之前的结果要么是平均- 美元( T$) 文本 $T$, 一个长度- 美元模式 $P$, 一个长度- 美元模式 $P$, 一个门槛值( JCS'21) 。 对于大约为$P$( $, JCS' ) 的碎片启动位置, 我们从 $( O) (n) (n) 和 cockt k%4) 的运行结果, 直径(n) 直径(n) 直径/ klog) 运行。