arxiv: 1204.5524 · v3 · pith:6E4MQ5KSnew · submitted 2012-04-25 · 💻 cs.DS

Time and Space Efficient Lempel-Ziv Factorization based on Run Length Encoding

classification 💻 cs.DS
keywords time · space · string · algorithms · extra · algorithm · encoding · factorization

We propose a new approach for calculating the Lempel-Ziv factorization of a string, based on run length encoding (RLE). We present a conceptually simple off-line algorithm based on a variant of suffix arrays, as well as an on-line algorithm based on a variant of directed acyclic word graphs (DAWGs). Both algorithms run in $O(N + n\log n)$ time and $O(n)$ extra space, where $N$ is the size of the string and $n \leq N$ is the number of RLE factors. The time dependency on $N$ arises only from the conversion of the string to RLE, which can be computed in $O(N)$ time and $O(1)$ extra space (excluding the output). When the string is compressible via RLE, i.e., $n = o(N)$, our algorithms are, to the best of our knowledge, the first that require only $o(N)$ extra space while running in $o(N\log N)$ time.
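The RLE conversion step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the function name `rle_factors` and the representation of each factor as a `(character, run_length)` pair are assumptions made for illustration. The generator makes a single left-to-right pass over the input and keeps only the current run in memory, which matches the $O(N)$ time and $O(1)$ extra space (excluding the output) stated above.

```python
from typing import Iterator, Tuple


def rle_factors(s: str) -> Iterator[Tuple[str, int]]:
    """Yield (character, run_length) pairs for the maximal runs of s.

    Hypothetical helper for illustration only; one pass over s,
    with constant working memory besides the yielded output.
    """
    if not s:
        return
    run_char, run_len = s[0], 1
    for i in range(1, len(s)):
        if s[i] == run_char:
            # Same character: extend the current run.
            run_len += 1
        else:
            # Run ended: emit it and start a new one.
            yield run_char, run_len
            run_char, run_len = s[i], 1
    # Emit the final run.
    yield run_char, run_len


if __name__ == "__main__":
    text = "aaabbbbac"
    factors = list(rle_factors(text))
    # N is the string length, n the number of RLE factors (n <= N).
    print(len(text), len(factors), factors)
```

For the string `aaabbbbac`, $N = 9$ and the sketch yields $n = 4$ factors. The Lempel-Ziv factorization algorithms described in the abstract would then operate on this sequence of $n$ factors rather than on the $N$ original characters.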
