In Pattern Matching with Weighted Edits (PMWED), we are given a pattern $P$ of length $m$, a text $T$ of length $n$, a positive threshold $k$, and oracle access to a weight function that specifies the costs of edits (depending on the involved characters, and normalized so that the cost of each edit is at least $1$). The goal is to compute the starting positions of all fragments of $T$ that can be obtained from $P$ with edits of total cost at most $k$. PMWED captures typical real-world applications more accurately than its unweighted variant (PMED), where all edits have unit costs. We obtain three main results: (a) a conceptually simple $\tilde{O}(nk)$-time algorithm for PMWED, very different from that of Landau and Vishkin for PMED; (b) a significantly more complicated $\tilde{O}(n+k^{3.5} \cdot W^4\cdot n/m)$-time algorithm for PMWED under the assumption that the weight function is a metric with integer values between $0$ and $W$; and (c) an $\tilde{O}(n+k^4 \cdot n/m)$-time algorithm for PMWED for the case of arbitrary weights. In the setting of metrics with small integer values, we nearly match the state of the art for PMED where $W=1$.
翻译:暂无翻译