pith. sign in

arxiv: 1801.09159 · v2 · pith:UP6TVB5Inew · submitted 2018-01-28 · 💻 cs.DS

Faster Approximate(d) Text-to-Pattern L1 Distance

classification 💻 cs.DS
keywords distancewidetildeemphsqrtvarepsilonalgorithmsapproximatedistances
0
0 comments X
read the original abstract

The problem of finding \emph{distance} between \emph{pattern} of length $m$ and \emph{text} of length $n$ is a typical way of generalizing pattern matching to incorporate dissimilarity score. For both Hamming and $L_1$ distances only a super linear upper bound $\widetilde{O}(n\sqrt{m})$ are known, which prompts the question of relaxing the problem: either by asking for $(1 \pm \varepsilon)$ approximate distance (every distance is reported up to a multiplicative factor), or $k$-approximated distance (distances exceeding $k$ are reported as $\infty$). We focus on $L_1$ distance, for which we show new algorithms achieving complexities respectively $\widetilde{O}(\varepsilon^{-1} n)$ and $\widetilde{O}((m+k\sqrt{m}) \cdot n/m)$. This is a significant improvement upon previous algorithms with runtime $\widetilde{O}(\varepsilon^{-2} n)$ of Lipsky and Porat [Algorithmica 2011] and $\widetilde{O}(n\sqrt{k})$ of Amir, Lipsky, Porat and Umanski [CPM 2005].

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.