pith. sign in

arxiv: 2601.21636 · v3 · pith:G36MMAIVnew · submitted 2026-01-29 · 💻 cs.LG · cs.CR· stat.ML

Sampling-Free Privacy Accounting for Matrix Mechanisms under Random Allocation

Pith reviewed 2026-05-21 13:55 UTC · model grok-4.3

classification 💻 cs.LG cs.CRstat.ML
keywords differential privacyprivacy amplificationmatrix mechanismsrandom allocationRényi divergencedynamic programmingballs-in-bins modelconditional composition
0
0 comments X

The pith

Privacy amplification for matrix mechanisms under random allocation can be computed without sampling via Rényi divergence and dynamic programming.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops sampling-free privacy bounds for differentially private model training that relies on matrix factorization with data allocated randomly according to the balls-in-bins model. It replaces Monte Carlo sampling with bounds derived from Rényi divergence, which are computed efficiently through a dynamic programming formulation, and supplements them with conditional composition to tighten guarantees at small privacy budgets. This removes the need for high-probability statements or random abstention required by prior approaches while working for any matrix, banded or not. A sympathetic reader cares because the method delivers deterministic accounting whose sample complexity does not grow with the inverse of the failure probability δ.

Core claim

We develop sampling-free bounds based on Rényi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small ε, where Rényi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices.

What carries the argument

Dynamic programming formulation that computes Rényi divergence bounds for privacy amplification under the balls-in-bins random allocation model, together with conditional composition for small-ε tightening.

If this is right

  • Privacy accounting holds with probability one instead of only with high probability.
  • Conditional composition yields tighter overall privacy loss than Rényi bounds alone when ε is small.
  • The same procedure covers both banded matrices common in practice and fully dense matrices.
  • No additional abstention step is required to obtain the stated guarantees.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be adapted to other random subsampling schemes beyond balls-in-bins if the allocation probabilities admit a similar recurrence.
  • Tighter bounds might allow practitioners to train for more epochs or with larger batch sizes before hitting a fixed privacy budget.
  • The dynamic program could be further specialized for matrices with additional structure such as Toeplitz or circulant forms to reduce runtime.

Load-bearing premise

The dynamic programming procedure produces valid, non-vacuous Rényi divergence bounds for privacy amplification under random allocation for arbitrary matrices.

What would settle it

For a small matrix with only a few rows and columns, compare the privacy parameter obtained from the dynamic program against the exact value obtained by enumerating every possible allocation of records to bins.

read the original abstract

We study privacy amplification for differentially private model training with matrix factorization under random allocation (also known as the balls-in-bins model). Recent work by Choquette-Choo et al. (2025) proposes a sampling-based Monte Carlo approach to compute amplification parameters in this setting. However, their guarantees either only hold with some high probability or require random abstention by the mechanism. Furthermore, the required number of samples for ensuring $(\epsilon,\delta)$-DP is inversely proportional to $\delta$. In contrast, we develop sampling-free bounds based on R\'enyi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small $\epsilon$, where R\'enyi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices. Through numerical comparisons, we demonstrate the efficacy of our approach across a broad range of matrix mechanisms used in research and practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops sampling-free privacy bounds for matrix mechanisms under random (balls-in-bins) allocation. It replaces Monte Carlo sampling with Rényi divergence bounds computed via dynamic programming, augmented by conditional composition to tighten guarantees at small ε. The framework is claimed to apply to arbitrary banded and non-banded matrices and is supported by numerical comparisons.

Significance. A correct sampling-free method would eliminate the high-probability or abstention caveats of prior Monte Carlo approaches and remove the inverse dependence on δ, offering deterministic, efficient privacy accounting for practical matrix-factorization mechanisms in DP training.

major comments (2)
  1. [Dynamic Programming Formulation] The dynamic programming recurrence must be shown to compute valid (not merely heuristic) upper bounds on the Rényi divergence of the output distribution for arbitrary non-banded matrices. Any factorization or independence assumption in the state transitions would render the computed divergence either invalid or an uncontrolled over-approximation, directly undermining the sampling-free guarantee.
  2. [Dynamic Programming Formulation] The paper asserts that the DP works for both banded and non-banded cases, yet the state must track the exact joint distribution over which matrix entries receive how many samples under random allocation. A concrete proof or inductive argument establishing that the recurrence preserves this joint distribution is required.
minor comments (2)
  1. [Abstract] The abstract states that conditional composition offers stronger guarantees for small ε, but the precise regime (e.g., the crossover point with pure Rényi bounds) should be quantified in the main text.
  2. [Numerical Experiments] Numerical comparisons would benefit from explicit reporting of the matrix dimensions, band widths, and δ values used, to allow direct reproduction of the claimed efficacy.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback. We address each major comment below and will revise the manuscript to strengthen the formal justification of the dynamic programming approach.

read point-by-point responses
  1. Referee: The dynamic programming recurrence must be shown to compute valid (not merely heuristic) upper bounds on the Rényi divergence of the output distribution for arbitrary non-banded matrices. Any factorization or independence assumption in the state transitions would render the computed divergence either invalid or an uncontrolled over-approximation, directly undermining the sampling-free guarantee.

    Authors: We agree that a rigorous demonstration of validity is required. The current manuscript describes the DP recurrence but does not contain a complete inductive proof for arbitrary (including non-banded) matrices. In the revision we will add an induction on the number of allocated samples that shows the state transitions exactly preserve the joint distribution induced by random allocation, without introducing extraneous factorization or independence assumptions. The resulting values are therefore valid upper bounds on the Rényi divergence. revision: yes

  2. Referee: The paper asserts that the DP works for both banded and non-banded cases, yet the state must track the exact joint distribution over which matrix entries receive how many samples under random allocation. A concrete proof or inductive argument establishing that the recurrence preserves this joint distribution is required.

    Authors: We acknowledge the need for an explicit argument. The manuscript claims applicability to arbitrary matrices but relies on an implicit correspondence between the DP state and the balls-in-bins process. We will insert a detailed inductive argument in the revised version that defines the state as the occupancy vector over matrix entries and proves that each transition step maintains the exact joint distribution for both banded and non-banded cases. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard Rényi tools to random allocation via DP

full rationale

The paper derives sampling-free privacy bounds by applying established Rényi divergence and conditional composition to the balls-in-bins random allocation model for matrix mechanisms. The dynamic programming formulation is presented as a computational device to evaluate these divergences for arbitrary matrices, without evidence that the recurrence is defined in terms of the target bounds themselves or that any load-bearing step reduces to a fitted parameter or self-citation chain. The central claims rest on the correctness of the DP recurrence as a valid upper bound rather than on any self-referential construction. This is the normal case of an independent derivation; no quoted reduction to inputs by construction appears in the provided sections.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of applying Rényi divergence and conditional composition to the random allocation model and on the correctness of the dynamic programming procedure for computing those bounds; no free parameters or new entities are mentioned.

axioms (2)
  • domain assumption Rényi divergence provides a valid bound on privacy amplification under random allocation for matrix mechanisms
    Basis for the sampling-free bounds described in the abstract.
  • ad hoc to paper Dynamic programming can efficiently compute the exact Rényi divergence values for arbitrary banded and non-banded matrices
    Enables the sampling-free computation claimed in the abstract.

pith-pipeline@v0.9.0 · 5699 in / 1337 out tokens · 66140 ms · 2026-05-21T13:55:15.598725+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.