m-Bonsai: a Practical Compact Dynamic Trie
read the original abstract
We consider the problem of implementing a space-efficient dynamic trie, with an emphasis on good practical performance. For a trie with $n$ nodes with an alphabet of size $\sigma$, the information-theoretic lower bound is $n \log \sigma + O(n)$ bits. The Bonsai data structure is a compact trie proposed by Darragh et al. (Softw., Pract. Exper. 23(3), 1993, p. 277-291). Its disadvantages include the user having to specify an upper bound $M$ on the trie size in advance (which cannot be changed easily after initalization), a space usage of $M \log \sigma + O(M \log \log M)$ (which is asymptotically non-optimal for smaller $\sigma$ or if $n \ll M$) and a lack of support for deletions. It supports traversal and update operations in $O(1/\epsilon)$ expected time (based on assumptions about the behaviour of hash functions), where $\epsilon = (M-n)/M$ and has excellent speed performance in practice. We propose an alternative, m-Bonsai, that addresses the above problems, obtaining a trie that uses $(1+\beta) n (\log \sigma + O(1))$ bits in expectation, and supports traversal and update operations in $O(1/\beta)$ expected time and $O(1/\beta^2)$ amortized expected time, for any user-specified parameter $\beta > 0$ (again based on assumptions about the behaviour of hash functions). We give an implementation of m-Bonsai which uses considerably less memory and is slightly faster than the original Bonsai.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.