PivCo-Huffman

Marcin Zukowski

arxiv: 2606.05765 · v1 · pith:FNL76BF3new · submitted 2026-06-04 · 💻 cs.DS · cs.PF

PivCo-Huffman

Marcin Zukowski This is my paper

Pith reviewed 2026-06-27 23:36 UTC · model grok-4.3

classification 💻 cs.DS cs.PF

keywords Huffman codingwavelet treesSIMDANS codingdata compressiondecoding throughputpivot structure

0 comments

The pith

PivCo-Huffman adapts wavelet tree pivots to Huffman trees for SIMD-friendly operations and higher decoding throughput.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes PivCo-Huffman as a new Huffman coding approach that borrows the pivot structure from wavelet trees. This produces a tree layout that supports fast, vectorized encoding and decoding operations. Experiments show it delivers higher decoding throughput than existing Huffman implementations. The same layout also permits selective use of ANS coding on skewed nodes, producing compression ratios that approach those of ANS codecs while retaining very high decompression speeds. Readers would care because Huffman coding remains central to many compression systems, so gains in speed or flexible ratio-speed tradeoffs affect practical performance.

Core claim

Mapping the pivot structure from wavelet trees onto Huffman trees yields a layout that supports efficient SIMD vectorized encoding and decoding operations, resulting in higher decoding throughput than state-of-the-art Huffman codecs, while also enabling selective application of ANS coding to skewed nodes to reach compression ratios approaching those of ANS-based codecs without sacrificing decompression speed.

What carries the argument

The pivot-coded Huffman structure that organizes Huffman trees using wavelet-tree pivots to support vectorized access patterns.

If this is right

Decoding throughput exceeds that of state-of-the-art Huffman codecs.
Encoding and decoding operations become amenable to SIMD vectorization.
Selective ANS coding on skewed nodes produces compression ratios approaching ANS-based methods.
The hybrid structure maintains very high decompression speeds alongside the improved ratios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The pivot layout might be adapted to other tree-based prefix codes beyond Huffman.
Hardware-specific tuning of pivot placement could further increase SIMD utilization on particular processors.
The approach suggests a general route for blending wavelet-style navigation with entropy coding for new data types.

Load-bearing premise

The wavelet-tree pivot structure can be mapped onto Huffman trees without unacceptable overhead in construction time, memory use, or compression ratio, while keeping the layout suitable for efficient SIMD vectorization on real hardware.

What would settle it

A side-by-side benchmark on standard corpora showing that PivCo-Huffman decoding throughput does not exceed that of the best prior Huffman codecs.

Figures

Figures reproduced from arXiv: 2606.05765 by Marcin Zukowski.

**Figure 1.** Figure 1: Classical Huffman tree for the word “huffman” node = root while not is_leaf(node) if read_bit() == 1: node = node->right else: node = node->left return node.symbol Listing 1: Naive Huffman decoding for one symbol In classical Huffman coding, each symbol is encoded using a code (a sequence of bits), with more frequent symbols getting shorter codes [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗

**Figure 2.** Figure 2: Example of pivoting a Huffmanencoded string. All bits sharing the same prefix (noted in quotes) are grouped together as a bitmap. Color-coded letters with subscript denote which-letter which-bit combinations. * marks a code-terminal bit. 1 0 0 1 0 2 0 3 1 4 1 5 1 6 1 1 0 2 0 3 0 0 1 4 0 5 1 6 1 0 0 5 0 4 1 6 f u a h m n h 1 h0 0 h1 1 h2 u 0 u0 1 u1 f 0 f0 0 f1 f 0 f0 0 f1 m 1 m0 1 m1 0 m2 a 1 a0 0 a1 0 a2… view at source ↗

**Figure 4.** Figure 4: Decoding-strategy for a naive Huffman tree Code Explanation P partition - split indices into left/right based on bitmap PR partition_root - like partition but for the root node PH partition_half - like partition, but produces only one output C not a primitive - marks the “constant”, top-frequency key S1 scatter - scatters a single symbol into output S2 scatter_two - scatters two symbols into output SFD sca… view at source ↗

**Figure 5.** Figure 5: shows the benefit in reduced operations per decoded byte going from 4.071 to 3.286. 2.2.2. Frequent symbol optimization One of the problems of our decoding primitives is writing into non-contiguous positions in the output, presenting challenges for modern CPUs and memory subsystems (see Section 2.4.2). We can mitigate it by avoiding this completely for the most frequent symbol, by simply prefilling the ent… view at source ↗

**Figure 7.** Figure 7: Detecting “flat” subtrees D=2 D=2 PR SF2 PH SF2 a C c o p y - n t u coconut-papaya — optimized subtrees ops/byte: 2.286 [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗

**Figure 9.** Figure 9: Bottom-up traversed “huffman” tree (flat trees off). See how each node produces a dense list of symbols. This results in the approach presented in [PITH_FULL_IMAGE:figures/full_fig_p012_9.png] view at source ↗

**Figure 10.** Figure 10: Bottom-up tree operations (optimized flat trees off) [PITH_FULL_IMAGE:figures/full_fig_p013_10.png] view at source ↗

**Figure 11.** Figure 11: PivCo-Huffman decoding performance with different tree optimization levels [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗

**Figure 12.** Figure 12: Encoding tree operations (optimized flat trees off) Symbol Explanation EPF enc_partition_full - create a bitmap, split codes into left/right outputs EPL enc_partition_left - right child is a leaf, only produce bitmap+left codes EPR enc_partition_right - left child is a leaf, only produce bitmap+right codes EPN enc_partition_none - both children are leaves, only produce bitmap PKN packN - pack codes into … view at source ↗

**Figure 13.** Figure 13: Encoding throughput, M4 (top) and c8i (bottom) — the data of [PITH_FULL_IMAGE:figures/full_fig_p018_13.png] view at source ↗

**Figure 14.** Figure 14: Skew visualization for calgary_pic (top part of the tree only) 51.1/48.9 100% 48.3/51.7 51% 48.9/51.1 49% 94.3/5.7 25% 99.0/1.0 1.4% 52.2/47.8 0% 44.4/55.6 0% 48.5/51.5 0% 56.3/43.8 0% 55.0/45.0 0% 50.0/50.0 0% 47.1/52.9 0% 44.4/55.6 0% 57.1/42.9 0% 54.5/45.5 0% 44.4/55.6 0% 50.0/50.0 0% 50.0/50.0 0% 55.6/44.4 0% 50.0/50.0 0% 60.0/40.0 0% 50.0/50.0 0% 60.0/40.0 0% 50.0/50.0 0% 50.0/50.0 0% 50.0/50.0 0% 50… view at source ↗

**Figure 16.** Figure 16: Decode throughput per engine, M4 (top) and c8i (bottom) — the bandwidth columns of [PITH_FULL_IMAGE:figures/full_fig_p022_16.png] view at source ↗

**Figure 17.** Figure 17: PivCo-Huffman performance over Huff0 on 3 CPU families on AWS (across datasets from [PITH_FULL_IMAGE:figures/full_fig_p024_17.png] view at source ↗

read the original abstract

Huffman encoding has been an enduring technique for 70+ years, ubiquitous in compression algorithms since its invention. In this paper we propose a new approach to Huffman coding, based on a data structure from wavelet trees. The resulting pivot-coded Huffman (PivCo-Huffman) enables high-performance SIMD-friendly encoding and decoding operations. In our tests PivCo-Huffman consistently outperforms state-of-the-art Huffman codecs in decoding throughput. Additionally, we show how ANS-coding can be selectively applied to skewed nodes in this structure, yielding compression ratios approaching those of ANS-based codecs while preserving very high decompression speeds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

PivCo-Huffman sketches a wavelet-pivot layout for Huffman that targets SIMD decoding plus selective ANS, but the abstract supplies zero measurements so the claims stay untestable.

read the letter

The core idea is to borrow pivot structures from wavelet trees to reorganize a Huffman tree so that the decoding walk becomes more regular and vector-friendly, then bolt ANS onto the most skewed leaves. That specific combination does not appear in the cited prior work, so the construction itself counts as new.

The framing is practical: Huffman still shows up in many pipelines where raw speed matters more than ultimate ratio, and anything that keeps the decode path branch-free and load-friendly could help throughput. The selective ANS graft is a reasonable way to recover some compression without giving up the fast path on the rest of the tree.

The obvious gap is the complete absence of numbers. The abstract asserts consistent outperformance in decoding throughput yet shows no baselines, no hardware, no access-pattern counts, and no construction-time or memory figures. Without those, it is impossible to know whether the pivot arrays actually produce contiguous vector loads or whether they introduce the pointer chasing the stress-test note flags. If the layout forces scattered accesses, the SIMD premise collapses even if the tree can be built.

Construction overhead and any change in final ratio are also left unmentioned, both of which matter for adoption. The paper therefore reads as a proposal whose central engineering claim cannot yet be evaluated.

It is aimed at people who maintain high-throughput compression inside databases or analytics engines. A reader already working on codec internals might extract the pivot-mapping idea and try it, but the current text does not give enough to reproduce or cite the result.

The thinking is coherent on its own terms, so the work deserves a referee who can check the full experiments and any code or data that exist.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes PivCo-Huffman, a Huffman coding variant that grafts wavelet-tree pivot structures onto Huffman trees. It claims this layout enables SIMD-friendly encoding/decoding operations, yields higher decoding throughput than existing Huffman codecs, and permits selective ANS coding on skewed nodes to reach compression ratios approaching those of ANS codecs while retaining high decompression speed.

Significance. If the mapping produces a genuinely branch-free, vector-load-friendly layout with no material overhead in construction time, memory, or ratio, the result would be a useful data-structure contribution for high-throughput decompression on SIMD hardware. The hybrid Huffman-plus-selective-ANS idea is also potentially practical.

major comments (2)

[Abstract] Abstract: the central empirical claim that PivCo-Huffman 'consistently outperforms state-of-the-art Huffman codecs in decoding throughput' is stated without any measurements, baselines, error bars, or implementation details, so the claim cannot be assessed.
[Abstract] Abstract (and implied § on construction): the claim that the wavelet-tree pivot layout produces a memory layout whose decoding walk is branch-free and vector-load friendly is asserted without quantifying random memory accesses per symbol, pointer-chasing depth, or alignment properties of the pivot arrays. This directly affects whether the SIMD advantage can materialize on real hardware.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these focused comments on the abstract and the description of the memory layout. We agree that both points identify areas where the manuscript can be strengthened by making empirical support and structural properties more explicit. We will revise the abstract and add the requested analysis in the construction and decoding sections.

read point-by-point responses

Referee: [Abstract] Abstract: the central empirical claim that PivCo-Huffman 'consistently outperforms state-of-the-art Huffman codecs in decoding throughput' is stated without any measurements, baselines, error bars, or implementation details, so the claim cannot be assessed.

Authors: We agree that the abstract presents the throughput claim without supporting numbers. The experiments section reports concrete throughput figures against multiple Huffman baselines (including canonical Huffman and other SIMD variants), with repeated runs providing standard deviations. We will revise the abstract to include a concise summary of these results (e.g., average speedup and primary baselines) so the claim can be assessed without reading further. revision: yes
Referee: [Abstract] Abstract (and implied § on construction): the claim that the wavelet-tree pivot layout produces a memory layout whose decoding walk is branch-free and vector-load friendly is asserted without quantifying random memory accesses per symbol, pointer-chasing depth, or alignment properties of the pivot arrays. This directly affects whether the SIMD advantage can materialize on real hardware.

Authors: The manuscript explains that the pivot arrays replace traditional tree traversal with fixed-position array accesses, enabling branch-free SIMD loads. However, we acknowledge that explicit metrics for random accesses per symbol, pointer-chasing depth, and alignment are not quantified. We will add a short analysis subsection reporting these properties (average random accesses, maximum depth, and 64-byte alignment of pivot arrays) to substantiate the SIMD claim. revision: yes

Circularity Check

0 steps flagged

No circularity: proposal is a structural mapping with no self-referential equations or fitted predictions

full rationale

The paper introduces a wavelet-tree pivot layout grafted onto Huffman trees as a new encoding structure. No equations, fitted parameters, or predictions appear in the provided abstract or description. The central claim is an empirical performance assertion about SIMD throughput and selective ANS application; it does not reduce any derived quantity back to its own inputs by construction. No self-citation load-bearing steps or ansatz smuggling are visible. The derivation chain is therefore self-contained as a design proposal rather than a closed mathematical reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; ledger is empty by default.

pith-pipeline@v0.9.1-grok · 5607 in / 899 out tokens · 14092 ms · 2026-06-27T23:36:04.877085+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 12 canonical work pages · 3 internal anchors

[1]

A Method for the Construction of Minimum-Redundancy Codes,

D. A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes, ” Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, 1952, doi: 10.1109/JRPROC.1952.273898

work page doi:10.1109/jrproc.1952.273898 1952
[2]

On the Implementation of Minimum Redundancy Prefix Codes,

A. Moffat and A. Turpin, “On the Implementation of Minimum Redundancy Prefix Codes, ” IEEE Transactions on Communications, vol. 45, no. 10, pp. 1200–1207, 1997, doi: 10.1109/26.634683

work page doi:10.1109/26.634683 1997
[3]

Arithmetic coding for data compression,

I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression, ” Communica- tions of the ACM, vol. 30, no. 6, pp. 520–540, 1987, doi: 10.1145/214762.214771

work page doi:10.1145/214762.214771 1987
[4]

Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding

J. Duda, “Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. ” [Online]. Available: https://arxiv.org/abs/1311.2540

work page internal anchor Pith review Pith/arXiv arXiv
[5]

Generating a canonical prefix encoding,

E. S. Schwartz and B. Kallick, “Generating a canonical prefix encoding, ” Communications of the ACM, vol. 7, no. 3, pp. 166–169, 1964, doi: 10.1145/363958.363991

work page doi:10.1145/363958.363991 1964
[6]

Interleaved entropy coders

F. Giesen, “Interleaved entropy coders. ” [Online]. Available: https://arxiv.org/abs/1402.3392

work page internal anchor Pith review Pith/arXiv arXiv
[7]

Entropy decoding in Oodle Data: x86-64 6-stream Huffman decoders

F. Giesen, “Entropy decoding in Oodle Data: x86-64 6-stream Huffman decoders. ” [Online]. Available: https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6- stream-huffman-decoders/

2023
[8]

FiniteStateEntropy: New Generation Entropy coders)

Y. Collet, “FiniteStateEntropy: New Generation Entropy coders). ” [Online]. Available: https:// github.com/cyan4973/FiniteStateEntropy
[9]

zstd: A real-time lossless data compression library

Y. Collet and others, “zstd: A real-time lossless data compression library. ” [Online]. Available: https://github.com/facebook/zstd
[10]

Entropy coding in Oodle Data: Huffman coding

F. Giesen, “Entropy coding in Oodle Data: Huffman coding. ” [Online]. Available: https://fgiesen. wordpress.com/2021/08/30/entropy-coding-in-oodle-data-huffman-coding/

2021
[11]

Balancing Vectorized Query Execution with Bandwidth-Optimized Storage,

M. Zukowski, “Balancing Vectorized Query Execution with Bandwidth-Optimized Storage, ” Doc- toral dissertation, 2009. [Online]. Available: https://pure.uva.nl/ws/files/942072/72163_thesis.pdf

2009
[12]

High-order entropy-compressed text indexes,

R. Grossi, A. Gupta, and J. S. Vitter, “High-order entropy-compressed text indexes, ” in Proc. SODA 2003, SIAM, 2003, pp. 841–850. [Online]. Available: https://dl.acm.org/doi/10.5555/644108.644250

work page doi:10.5555/644108.644250 2003
[13]

Bit-Parallel (Compressed) Wavelet Tree Construction,

P. Dinklage, J. Fischer, F. Kurpicz, and J.-P. Tarnowski, “Bit-Parallel (Compressed) Wavelet Tree Construction, ” in Proc. DCC 2023 , 2023. [Online]. Available: https://www.kurpicz.org/assets/ publications/dcc_2023.pdf

2023
[14]

Range ANS (rANS) - faster direct replacement for range coding

“Range ANS (rANS) - faster direct replacement for range coding. ” [Online]. Available: https:// encode.su/threads/1870-Range-ANS-(rANS)-faster-direct-replacement-for-range-coding
[15]

Oodle Data Compression Suite

RAD Game Tools (now Epic), “Oodle Data Compression Suite. ” 2025

2025
[16]

Neoverse V1 platform

Arm Limited, “Neoverse V1 platform. ” [Online]. Available: https://www.arm.com/products/ silicon-ip-cpu/neoverse/neoverse-v1
[17]

Run-length encodings,

S. W. Golomb, “Run-length encodings, ” IEEE Transactions on Information Theory , vol. 12, no. 3, pp. 399–401, 1966

1966
[18]

Some Practical Universal Noiseless Coding Techniques,

R. F. Rice, “Some Practical Universal Noiseless Coding Techniques, ” technical report JPL-79–22, 1979

1979
[19]

Partitioned Elias-Fano indexes,

G. Ottaviano and R. Venturini, “Partitioned Elias-Fano indexes, ” in Proc. SIGIR 2014, 2014

2014
[20]

Synthesis of noiseless compression codes,

B. P. Tunstall, “Synthesis of noiseless compression codes, ” Doctoral dissertation, 1967

1967
[21]

Compressing relations and indexes,

J. Goldstein, R. Ramakrishnan, and U. Shaft, “Compressing relations and indexes, ” in Proc. 14th Intl. Conf. on Data Engineering (ICDE '98), IEEE Computer Society, 1998, pp. 370–379. doi: 10.1109/ ICDE.1998.655800

work page arXiv 1998
[22]

Stream VByte: breaking new speed records for integer compression

D. Lemire, “Stream VByte: breaking new speed records for integer compression. ” [Online]. Avail- able: https://lemire.me/blog/2017/09/27/stream-vbyte-breaking-new-speed-records-for-integer- compression/

2017
[23]

Sayood, Introduction to Data Compression, 5th ed

K. Sayood, Introduction to Data Compression, 5th ed. Morgan Kaufmann, 2017

2017
[24]

The myriad virtues of Wavelet Trees,

P. Ferragina, R. Giancarlo, and G. Manzini, “The myriad virtues of Wavelet Trees, ” Information and Computation, vol. 207, no. 8, pp. 849–866, 2009, [Online]. Available: https://www.sciencedirect. com/science/article/pii/S0890540108001594

2009
[25]

Practical Wavelet Tree Construction,

P. Dinklage, J. Ellert, J. Fischer, F. Kurpicz, and M. Löbel, “Practical Wavelet Tree Construction, ” ACM Journal of Experimental Algorithmics (JEA) , vol. 26, no. 1, pp. Article1.8, 67pp., 2021, doi: 10.1145/3457197

work page doi:10.1145/3457197 2021
[26]

Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets

R. Raman, V. Raman, and S. R. Satti, “Succinct Indexable Dictionaries with Applications to Encoding k-ary Trees, Prefix Sums and Multisets, ” ACM Transactions on Algorithms, vol. 3, no. 4, p. Article43, 2007, [Online]. Available: https://arxiv.org/abs/0705.0552

work page internal anchor Pith review Pith/arXiv arXiv 2007
[27]

Super-Scalar RAM-CPU Cache Compression,

M. Zukowski, S. Héman, N. Nes, and P. Boncz, “Super-Scalar RAM-CPU Cache Compression, ” in Proc. 22nd Intl. Conf. on Data Engineering (ICDE '06), IEEE Computer Society, 2006. doi: 10.1109/ ICDE.2006.150

2006
[28]

Decoding billions of integers per second through vectorization,

D. Lemire and L. Boytsov, “Decoding billions of integers per second through vectorization, ” Software: Practice and Experience, vol. 45, no. 1, pp. 1–29, 2015, doi: 10.1002/spe.2203

work page doi:10.1002/spe.2203 2015
[29]

The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code,

A. Afroozeh and P. Boncz, “The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code, ” Proceedings of the VLDB Endowment, vol. 16, no. 9, pp. 2132–2144, 2023, doi: 10.14778/3598581.3598587

work page doi:10.14778/3598581.3598587 2023
[30]

Simple batch decoding of unary codes

F. Giesen, “Simple batch decoding of unary codes. ” [Online]. Available: https://fgiesen.wordpress. com/2026/05/30/simple-batch-decoding-of-unary-codes/ Appendices A Datasets Name #syms H (bits) Huffman (bits) min max Source / description calgary_pic 159 1.210 1.690 1 11 Calgary Corpus 1bpp CCITT scanned page, proba80-like real data (calgary_pic) chinese_...

2026

[1] [1]

A Method for the Construction of Minimum-Redundancy Codes,

D. A. Huffman, “A Method for the Construction of Minimum-Redundancy Codes, ” Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, 1952, doi: 10.1109/JRPROC.1952.273898

work page doi:10.1109/jrproc.1952.273898 1952

[2] [2]

On the Implementation of Minimum Redundancy Prefix Codes,

A. Moffat and A. Turpin, “On the Implementation of Minimum Redundancy Prefix Codes, ” IEEE Transactions on Communications, vol. 45, no. 10, pp. 1200–1207, 1997, doi: 10.1109/26.634683

work page doi:10.1109/26.634683 1997

[3] [3]

Arithmetic coding for data compression,

I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression, ” Communica- tions of the ACM, vol. 30, no. 6, pp. 520–540, 1987, doi: 10.1145/214762.214771

work page doi:10.1145/214762.214771 1987

[4] [4]

Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding

J. Duda, “Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding. ” [Online]. Available: https://arxiv.org/abs/1311.2540

work page internal anchor Pith review Pith/arXiv arXiv

[5] [5]

Generating a canonical prefix encoding,

E. S. Schwartz and B. Kallick, “Generating a canonical prefix encoding, ” Communications of the ACM, vol. 7, no. 3, pp. 166–169, 1964, doi: 10.1145/363958.363991

work page doi:10.1145/363958.363991 1964

[6] [6]

Interleaved entropy coders

F. Giesen, “Interleaved entropy coders. ” [Online]. Available: https://arxiv.org/abs/1402.3392

work page internal anchor Pith review Pith/arXiv arXiv

[7] [7]

Entropy decoding in Oodle Data: x86-64 6-stream Huffman decoders

F. Giesen, “Entropy decoding in Oodle Data: x86-64 6-stream Huffman decoders. ” [Online]. Available: https://fgiesen.wordpress.com/2023/10/29/entropy-decoding-in-oodle-data-x86-64-6- stream-huffman-decoders/

2023

[8] [8]

FiniteStateEntropy: New Generation Entropy coders)

Y. Collet, “FiniteStateEntropy: New Generation Entropy coders). ” [Online]. Available: https:// github.com/cyan4973/FiniteStateEntropy

[9] [9]

zstd: A real-time lossless data compression library

Y. Collet and others, “zstd: A real-time lossless data compression library. ” [Online]. Available: https://github.com/facebook/zstd

[10] [10]

Entropy coding in Oodle Data: Huffman coding

F. Giesen, “Entropy coding in Oodle Data: Huffman coding. ” [Online]. Available: https://fgiesen. wordpress.com/2021/08/30/entropy-coding-in-oodle-data-huffman-coding/

2021

[11] [11]

Balancing Vectorized Query Execution with Bandwidth-Optimized Storage,

M. Zukowski, “Balancing Vectorized Query Execution with Bandwidth-Optimized Storage, ” Doc- toral dissertation, 2009. [Online]. Available: https://pure.uva.nl/ws/files/942072/72163_thesis.pdf

2009

[12] [12]

High-order entropy-compressed text indexes,

R. Grossi, A. Gupta, and J. S. Vitter, “High-order entropy-compressed text indexes, ” in Proc. SODA 2003, SIAM, 2003, pp. 841–850. [Online]. Available: https://dl.acm.org/doi/10.5555/644108.644250

work page doi:10.5555/644108.644250 2003

[13] [13]

Bit-Parallel (Compressed) Wavelet Tree Construction,

P. Dinklage, J. Fischer, F. Kurpicz, and J.-P. Tarnowski, “Bit-Parallel (Compressed) Wavelet Tree Construction, ” in Proc. DCC 2023 , 2023. [Online]. Available: https://www.kurpicz.org/assets/ publications/dcc_2023.pdf

2023

[14] [14]

Range ANS (rANS) - faster direct replacement for range coding

“Range ANS (rANS) - faster direct replacement for range coding. ” [Online]. Available: https:// encode.su/threads/1870-Range-ANS-(rANS)-faster-direct-replacement-for-range-coding

[15] [15]

Oodle Data Compression Suite

RAD Game Tools (now Epic), “Oodle Data Compression Suite. ” 2025

2025

[16] [16]

Neoverse V1 platform

Arm Limited, “Neoverse V1 platform. ” [Online]. Available: https://www.arm.com/products/ silicon-ip-cpu/neoverse/neoverse-v1

[17] [17]

Run-length encodings,

S. W. Golomb, “Run-length encodings, ” IEEE Transactions on Information Theory , vol. 12, no. 3, pp. 399–401, 1966

1966

[18] [18]

Some Practical Universal Noiseless Coding Techniques,

R. F. Rice, “Some Practical Universal Noiseless Coding Techniques, ” technical report JPL-79–22, 1979

1979

[19] [19]

Partitioned Elias-Fano indexes,

G. Ottaviano and R. Venturini, “Partitioned Elias-Fano indexes, ” in Proc. SIGIR 2014, 2014

2014

[20] [20]

Synthesis of noiseless compression codes,

B. P. Tunstall, “Synthesis of noiseless compression codes, ” Doctoral dissertation, 1967

1967

[21] [21]

Compressing relations and indexes,

J. Goldstein, R. Ramakrishnan, and U. Shaft, “Compressing relations and indexes, ” in Proc. 14th Intl. Conf. on Data Engineering (ICDE '98), IEEE Computer Society, 1998, pp. 370–379. doi: 10.1109/ ICDE.1998.655800

work page arXiv 1998

[22] [22]

Stream VByte: breaking new speed records for integer compression

D. Lemire, “Stream VByte: breaking new speed records for integer compression. ” [Online]. Avail- able: https://lemire.me/blog/2017/09/27/stream-vbyte-breaking-new-speed-records-for-integer- compression/

2017

[23] [23]

Sayood, Introduction to Data Compression, 5th ed

K. Sayood, Introduction to Data Compression, 5th ed. Morgan Kaufmann, 2017

2017

[24] [24]

The myriad virtues of Wavelet Trees,

P. Ferragina, R. Giancarlo, and G. Manzini, “The myriad virtues of Wavelet Trees, ” Information and Computation, vol. 207, no. 8, pp. 849–866, 2009, [Online]. Available: https://www.sciencedirect. com/science/article/pii/S0890540108001594

2009

[25] [25]

Practical Wavelet Tree Construction,

P. Dinklage, J. Ellert, J. Fischer, F. Kurpicz, and M. Löbel, “Practical Wavelet Tree Construction, ” ACM Journal of Experimental Algorithmics (JEA) , vol. 26, no. 1, pp. Article1.8, 67pp., 2021, doi: 10.1145/3457197

work page doi:10.1145/3457197 2021

[26] [26]

Succinct Indexable Dictionaries with Applications to Encoding $k$-ary Trees, Prefix Sums and Multisets

R. Raman, V. Raman, and S. R. Satti, “Succinct Indexable Dictionaries with Applications to Encoding k-ary Trees, Prefix Sums and Multisets, ” ACM Transactions on Algorithms, vol. 3, no. 4, p. Article43, 2007, [Online]. Available: https://arxiv.org/abs/0705.0552

work page internal anchor Pith review Pith/arXiv arXiv 2007

[27] [27]

Super-Scalar RAM-CPU Cache Compression,

M. Zukowski, S. Héman, N. Nes, and P. Boncz, “Super-Scalar RAM-CPU Cache Compression, ” in Proc. 22nd Intl. Conf. on Data Engineering (ICDE '06), IEEE Computer Society, 2006. doi: 10.1109/ ICDE.2006.150

2006

[28] [28]

Decoding billions of integers per second through vectorization,

D. Lemire and L. Boytsov, “Decoding billions of integers per second through vectorization, ” Software: Practice and Experience, vol. 45, no. 1, pp. 1–29, 2015, doi: 10.1002/spe.2203

work page doi:10.1002/spe.2203 2015

[29] [29]

The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code,

A. Afroozeh and P. Boncz, “The FastLanes Compression Layout: Decoding >100 Billion Integers per Second with Scalar Code, ” Proceedings of the VLDB Endowment, vol. 16, no. 9, pp. 2132–2144, 2023, doi: 10.14778/3598581.3598587

work page doi:10.14778/3598581.3598587 2023

[30] [30]

Simple batch decoding of unary codes

F. Giesen, “Simple batch decoding of unary codes. ” [Online]. Available: https://fgiesen.wordpress. com/2026/05/30/simple-batch-decoding-of-unary-codes/ Appendices A Datasets Name #syms H (bits) Huffman (bits) min max Source / description calgary_pic 159 1.210 1.690 1 11 Calgary Corpus 1bpp CCITT scanned page, proba80-like real data (calgary_pic) chinese_...

2026