DynamicLogLog: Faster, Smaller, and More Accurate Cardinality Estimation

Brian Bushnell

arxiv: 2603.27405 · v2 · pith:W5MATT5Fnew · submitted 2026-03-28 · 💻 cs.DS

DynamicLogLog: Faster, Smaller, and More Accurate Cardinality Estimation

Brian Bushnell This is my paper

Pith reviewed 2026-05-14 21:16 UTC · model grok-4.3

classification 💻 cs.DS

keywords cardinality estimationHyperLogLogDynamicLogLogstreaming algorithmsdistinct countsketch data structuresmemory reduction

0 comments

The pith

DynamicLogLog stores relative leading-zero counts with a shared exponent to cut memory by one-third while removing HyperLogLog's error spike and adding an early-exit mask.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents DynamicLogLog as a direct replacement for HyperLogLog in stream cardinality estimation. It replaces per-bucket exponents with one global exponent and stores only the differences in leading-zero counts, which shrinks each bucket from six bits to four. This structure supports an early-exit mask that discards most elements before any bucket is touched and enables a hybrid estimator that blends linear counting with logarithmic counting to keep error flat across the full range. Simulations with two thousand buckets show mean absolute error near 1.83 percent using 1024 bytes, versus 1.84 percent mean but 34 percent peak error for HyperLogLog using 1536 bytes.

Core claim

DynamicLogLog uses a shared exponent across all buckets and stores only relative leading-zero counts. This yields four bits per bucket, an early-exit mask that rejects more than 99.9 percent of elements at high cardinality, and a Logarithmic Hybrid Blend that removes the transition-region error spike of standard HyperLogLog while keeping mean absolute error under two percent.

What carries the argument

Shared global exponent with relative leading-zero counts per bucket, combined with Dynamic Linear Counting and a Logarithmic Hybrid Blend for the final estimate.

If this is right

Memory for two thousand buckets drops to 1024 bytes while peak error stays near 1.83 percent.
Early-exit mask skips bucket accesses for over 99.9 percent of elements at high cardinality.
Maximum representable cardinality can be doubled by adding one global bit.
History-corrected and layered variants further reduce error using small per-state tables.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The four-bit layout may transfer to other leading-zero sketches that currently store full exponents.
Applications with clustered or bursty arrivals could see different early-exit rates than the uniform synthetic tests.
The hybrid blend suggests a general pattern for smoothing error transitions in other probabilistic counters.

Load-bearing premise

The accuracy and speed improvements observed on synthetic streams also hold for real-world data distributions.

What would settle it

Measure peak absolute error and early-exit rate on a large real-world trace whose exact distinct count is known in advance; if peak error rises above two percent or the early-exit fraction falls below 99 percent, the claimed gains do not generalize.

read the original abstract

Cardinality estimation - calculating the number of distinct elements in a stream - is a longstanding problem with applications from networking to bioinformatics. HyperLogLog (HLL), the prevailing standard, has a well-known error spike in its transition region and requires 6 bits per bucket, with data structure size scaling as B*log(log(cardinality)). We present DynamicLogLog (DLL), which uses a shared exponent across all buckets, storing only relative leading-zero counts. This yields three benefits: (1) only 4 bits per bucket (33% memory reduction), (2) an early exit mask that rejects >99.9% of elements at high cardinality before any bucket access (over 10x faster than HLL when bandwidth-constrained), and (3) a flat error profile via Dynamic Linear Counting (DLC) and a Logarithmic Hybrid Blend that eliminates HLL's transition artifact. Squaring the maximum representable cardinality requires only a single additional bit of global state. At 2,048 buckets with 512k simulations, DLL4's hybrid estimate achieves 1.830% mean and 1.834% peak absolute error using 1,024 bytes, compared to 1.84% mean and 34.1% peak for HLL using 1,536 bytes. DLC achieves 1.90% mean without correction factors. DynamicUltraLogLog (UDLL6), a fusion of DLL and UltraLogLog, achieves ULL-level accuracy at 75% of the memory. History-corrected variants (Hybrid+n) and Layered DLC (LDLC) provide further improvements using per-state correction tables and anti-phase error cancellation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DynamicLogLog cuts HLL memory by a third and removes the transition error spike via shared exponent and hybrid blending, with solid simulation numbers but no real-data checks.

read the letter

The core advance is the shared global exponent that stores only relative leading-zero counts per bucket. This drops the per-bucket cost from six bits to four, adds an early-exit mask that skips bucket access on most high-cardinality updates, and pairs Dynamic Linear Counting with a logarithmic blend to keep error flat instead of spiking in the transition region. At 2048 buckets the hybrid version reports 1.83 percent mean and peak absolute error in 1024 bytes, against HLL's 1.84 percent mean but 34 percent peak in 1536 bytes, all from 512k simulation runs. Those numbers are concrete and the design changes are straightforward to implement. The UltraLogLog fusion variant and the history-corrected tables are presented as further options without overclaiming. The simulations are large enough to make the reported deltas believable on uniform streams. The main limitation is that every accuracy and speed claim comes from synthetic data generated under the usual uniform-hash assumption. Real streams with locality, clustering, or adversarial patterns could lower the mask hit rate and let the blend weights reintroduce the peak error the paper eliminates in simulation. No correction factors or robustness bounds are given for that case, and no traces from networking or bioinformatics are tested. The paper is aimed at practitioners who maintain cardinality estimators in databases or streaming pipelines and want lower memory with bounded worst-case error. It is worth sending to a serious referee because the algorithmic changes are self-contained, the quantitative comparisons are direct, and the gap to real data is easy to state as a required revision rather than a fatal flaw.

Referee Report

1 major / 1 minor

Summary. The paper presents DynamicLogLog (DLL), an improvement over HyperLogLog (HLL) for cardinality estimation in streams. It uses a shared exponent across buckets to store only relative leading-zero counts, achieving 4 bits per bucket (vs. 6 in HLL), an early-exit mask rejecting >99.9% of elements at high cardinality, and a Logarithmic Hybrid Blend of Dynamic Linear Counting (DLC) and HLL to eliminate HLL's transition-region error spike. Claims include 33% memory reduction, over 10x speed in bandwidth-constrained settings, and superior accuracy: at 2048 buckets, DLL4 hybrid achieves 1.830% mean and 1.834% peak absolute error with 1024 bytes vs. HLL's 1.84% mean and 34.1% peak with 1536 bytes, based on 512k simulations. Variants like UDLL6 and history-corrected hybrids are also introduced.

Significance. If the empirical results hold beyond synthetic streams, the work offers a practical advance in streaming algorithms with immediate applicability to networking, databases, and bioinformatics. The 33% memory savings, early-exit optimization, and flattened error profile address longstanding HLL limitations without introducing correction factors in the base DLC case. Large-scale simulation evidence (512k runs) with concrete byte/error metrics strengthens the contribution, though real-world robustness remains to be confirmed.

major comments (1)

[Evaluation] Evaluation section (simulation results at 2048 buckets): the headline accuracy claims (1.830% mean / 1.834% peak error for DLL4 hybrid) rest solely on 512k synthetic uniform-hash streams. No analysis or bounds are provided for how the early-exit mask hit rate or implicit blend weights degrade under real-world locality, clustering, or adversarial patterns, which could reintroduce the 34% peak error the method claims to eliminate.

minor comments (1)

[Abstract] The abstract and evaluation could more explicitly state the exact bucket counts and memory formulas used for the 1024-byte vs 1536-byte comparison to aid reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment, the recommendation of minor revision, and the constructive comment on evaluation scope. We address the point below.

read point-by-point responses

Referee: Evaluation section (simulation results at 2048 buckets): the headline accuracy claims (1.830% mean / 1.834% peak error for DLL4 hybrid) rest solely on 512k synthetic uniform-hash streams. No analysis or bounds are provided for how the early-exit mask hit rate or implicit blend weights degrade under real-world locality, clustering, or adversarial patterns, which could reintroduce the 34% peak error the method claims to eliminate.

Authors: We agree that the reported results use synthetic uniform-hash streams, which is the standard approach in cardinality estimation literature to isolate algorithmic behavior under ideal hash assumptions. The early-exit mask and logarithmic hybrid blend are derived under this model, with the >99.9% rejection rate holding when hashes are uniform. We acknowledge the absence of explicit bounds or experiments on locality, clustering, or adversarial inputs, which could affect mask hit rates and blend weights. In the revised version we will add a dedicated paragraph in the Evaluation section clarifying the synthetic nature of the 512k simulations, noting that real-world deviation from uniformity may increase peak error, and stating that the 34.1% HLL spike is eliminated only under the tested conditions. We will also flag this as an item for future real-dataset validation. This constitutes a partial revision: we add discussion and caveats without new experiments or changed claims. revision: partial

Circularity Check

0 steps flagged

No circularity; algorithmic design validated on independent simulations

full rationale

The paper presents DynamicLogLog as a set of algorithmic changes (shared exponent, early-exit mask, DLC + logarithmic hybrid blend) whose accuracy is measured directly via 512k Monte Carlo simulations on synthetic streams. These error figures (1.830% mean / 1.834% peak at 1024 bytes) are empirical outputs, not quantities defined in terms of fitted parameters from the same data or reduced by construction to the input assumptions. No self-citations to load-bearing theorems, no uniqueness claims imported from prior author work, and no ansatz smuggled via citation appear in the text. The derivation chain is therefore self-contained against the external simulation benchmark.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

The approach relies on standard probabilistic assumptions for leading-zero counts in hash streams and on the validity of the simulation regime; no new physical constants or ad-hoc fitted parameters are introduced beyond the choice of bucket count for benchmarking.

free parameters (1)

bucket count
2048 buckets chosen for the reported comparison; the structure itself scales with this parameter.

pith-pipeline@v0.9.0 · 5597 in / 1076 out tokens · 31881 ms · 2026-05-14T21:16:30.098530+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

DLL uses a shared exponent (minZeros) ... absNlz = minZeros + relNlz ... DLC(t) = 2^t ⋅ B ⋅ ln(B/V_t) ... Logarithmic Hybrid Blend
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

self-similar CF lookup: CF(2C) ≈ CF(C) ... tier promotion ... 15-tier range

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.