Longest Common Subsequence in k-length substrings
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
In this paper we define a new problem, motivated by computational biology: $LCSk$, which aims to find the maximal number of length-$k$ substrings that match in both input strings while preserving their order of appearance. The traditional LCS definition is the special case of our problem where $k = 1$. We provide an algorithm that solves the general case in $O(n^2)$ time, where $n$ is the length of the input strings, equaling the time required for the special case $k = 1$. The space requirement of the algorithm is $O(kn)$. We also define a complementary $EDk$ distance measure and show that $EDk(A,B)$ can be computed in $O(nm)$ time and $O(km)$ space, where $m$ and $n$ are the lengths of the input sequences $A$ and $B$, respectively.
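The abstract does not spell out the recurrence, so the following is only a minimal sketch of one common dynamic-programming formulation of $LCSk$, not necessarily the authors' algorithm. It assumes that matched substrings may not overlap within either string. The function name `lcsk` and the table names `suffix` and `best` are illustrative. For readability the sketch stores full $O(nm)$ tables rather than the $O(kn)$ space stated in the abstract; the running time is $O(nm)$.

```python
def lcsk(a: str, b: str, k: int) -> int:
    """Count the maximum number of ordered, non-overlapping length-k
    substrings that match in both a and b (sketch, assumptions above)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    n, m = len(a), len(b)
    # suffix[i][j]: length of the longest common suffix of a[:i] and b[:j].
    suffix = [[0] * (m + 1) for _ in range(n + 1)]
    # best[i][j]: LCSk value for the prefixes a[:i] and b[:j].
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                suffix[i][j] = suffix[i - 1][j - 1] + 1
            # Option 1: skip the last character of a or of b.
            best[i][j] = max(best[i - 1][j], best[i][j - 1])
            # Option 2: end a matching length-k block at (i, j).
            # suffix[i][j] >= k implies i >= k and j >= k.
            if suffix[i][j] >= k:
                best[i][j] = max(best[i][j], best[i - k][j - k] + 1)
    return best[n][m]


print(lcsk("ABXAB", "ABYAB", 2))  # prints 2: the two "AB" blocks
```

With `k = 1` the recurrence reduces to the classical LCS table, which matches the special case noted in the abstract.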
fields
cs.DS (1)
years
2026 (1)
verdicts
UNVERDICTED (1)
representative citing papers
The anti-lexicographic SUS-anchor: a near-optimal k=1 sampling scheme
The anti-lexicographic SUS-anchor achieves sampling densities less than 1% above the lower bound for alphabet size 4 and k=1, substantially outperforming bidirectional anchors.