Near-Optimal Private Linear Regression via Iterative Hessian Mixing

Ashia C. Wilson; Katrina Ligett; Moshe Shenfeld; Omri Lev; Vishwak Srinivasan

arxiv: 2601.07545 · v2 · pith:YE5VYCFJnew · submitted 2026-01-12 · 💻 cs.LG · stat.ML

Near-Optimal Private Linear Regression via Iterative Hessian Mixing

Omri Lev , Moshe Shenfeld , Vishwak Srinivasan , Katrina Ligett , Ashia C. Wilson This is my paper

Pith reviewed 2026-05-25 07:28 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords differential privacylinear regressionsketchinghessian mixingexcess empirical riskordinary least squaresprivate optimizationAdaSSP

0 comments

The pith

Iterative Hessian Mixing improves excess empirical risk bounds for differentially private OLS by removing a multiplicative sqrt(dimension) factor from AdaSSP.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Iterative Hessian Mixing as a sketching-based algorithm for differentially private ordinary least squares regression on bounded data. It proves differential privacy for IHM and derives excess empirical risk bounds that eliminate a dimension-dependent multiplicative penalty present in the prior AdaSSP method. New accuracy guarantees for Gaussian sketching approaches explain when sketching succeeds and how the iterative mixing step avoids prior limitations. Experiments across many datasets confirm that IHM outperforms AdaSSP and other baselines in practice.

Core claim

IHM is differentially private and its excess empirical risk bounds improve upon those of AdaSSP by removing a multiplicative factor that can be as large as the square root of the data dimension, using new accuracy guarantees for prior Gaussian sketching approaches that clarify when these methods perform well and how IHM circumvents their limitations.

What carries the argument

Iterative Hessian Mixing (IHM), an iterative refinement procedure that mixes information from Gaussian Hessian sketches to produce a private regression estimator.

If this is right

IHM is expected to perform well precisely when the derived Gaussian sketching accuracy guarantees apply.
The removal of the sqrt(dimension) factor yields tighter utility guarantees than direct sufficient-statistics perturbation.
Empirical evaluations show consistent outperformance over AdaSSP and sketching baselines on large suites of datasets.
The design clarifies conditions under which sketching-based private regression succeeds versus when it requires iterative correction.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar iterative mixing might reduce dimension penalties in other private convex optimization tasks beyond linear regression.
The approach suggests sketching can become competitive with or superior to direct perturbation once iterative refinement is applied.
High-dimensional private learning may become feasible without accuracy degrading proportionally to dimension.

Load-bearing premise

The data matrices and response vectors are bounded in norm, and the accuracy guarantees presented for Gaussian sketching methods hold without extra error terms.

What would settle it

An experiment on bounded data in which the measured excess empirical risk of IHM fails to improve over AdaSSP by the predicted factor, or in which the output distribution of IHM violates the claimed differential privacy guarantee.

read the original abstract

We study differentially private ordinary least squares (DP-OLS) with bounded data $(X,Y)$ via sketching-based mechanisms. While Gaussian sketching approaches have been explored for DP-OLS \citep{sheffet2017differentially}, they are typically viewed as less competitive than the Adaptive Sufficient Statistics Perturbation (AdaSSP) method \citep{wang_adassp}, which directly perturbs the sufficient statistics $(X^{\top}X, X^{\top}Y)$. This method was shown to be close to information-theoretically optimal, while also exhibiting strong empirical performance. In this work, we propose the \emph{Iterative Hessian Mixing} (IHM), an algorithm that builds on Gaussian sketching approaches to DP-OLS and is inspired by the Iterative Hessian Sketch of \citet{pilanci_hessiansketch}. We prove that IHM is differentially private and provide utility guarantees in the form of excess empirical risk bounds. These bounds improve upon those of AdaSSP by removing a multiplicative factor that can be as large as the square root of the data dimension. The design of the IHM is based on new accuracy guarantees that we present for prior Gaussian sketching approaches for DP-OLS, which clarify when these methods are expected to perform well and how IHM circumvents their inherent limitations. We also conduct a rigorous empirical evaluation on a large suite of datasets, demonstrating that IHM consistently outperforms prior baselines, including AdaSSP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

IHM claims a sqrt(d) improvement over AdaSSP for DP-OLS via new sketching analysis, but that analysis is the load-bearing step.

read the letter

The core claim is that Iterative Hessian Mixing gives better excess empirical risk bounds for differentially private linear regression than AdaSSP by dropping a multiplicative sqrt(d) factor. The authors do this by extending Gaussian sketching with an iterative mixing step and by supplying new accuracy guarantees for earlier sketching methods that explain when those methods work and how IHM sidesteps their limits. They also report that IHM beats the baselines on a large collection of datasets while remaining private. That is the actual new piece: the algorithm plus the supporting analysis of sketching accuracy, not just another perturbation of the sufficient statistics. The empirical section is presented as rigorous, which is worth checking for dataset handling and variance reporting. The main soft spot is exactly where the stress-test note points: the utility gain depends on the new sketching guarantees being both correct and strong enough to remove the dimension factor when plugged into the IHM construction. Bounded data is standard, so that part is not the issue; the load-bearing part is whether the fresh analysis of Gaussian sketches holds without extra assumptions. If the proofs check out, the bound improvement is real and practically relevant for moderate-to-high dimension regimes. If they contain a gap, the headline result does not follow. The paper is aimed at people working on private convex optimization and DP linear models. It is the kind of targeted improvement on a core primitive that a serious referee should see, even if revisions are needed on the analysis or experiments. I would send it to review rather than desk reject.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes Iterative Hessian Mixing (IHM), a sketching-based algorithm for differentially private ordinary least squares regression on bounded data (X, Y). It claims to prove that IHM satisfies differential privacy, derives excess empirical risk bounds that improve on AdaSSP by removing a multiplicative factor up to sqrt(d), and supports these claims with new accuracy guarantees for prior Gaussian sketching methods. The work also reports empirical results showing consistent superiority over baselines including AdaSSP across a large suite of datasets.

Significance. If the new accuracy guarantees for Gaussian sketching are valid and the utility bounds follow, the result would advance DP linear regression by delivering near-optimal excess risk via sketching without the sqrt(d) penalty present in AdaSSP, while retaining computational benefits of sketching. The rigorous empirical evaluation on multiple datasets provides concrete evidence of practical gains and strengthens the contribution.

major comments (2)

[Section deriving new accuracy guarantees for Gaussian sketching] The new accuracy guarantees for prior Gaussian sketching approaches (abstract and the section deriving them): these are load-bearing for the headline claim that IHM removes the sqrt(d) multiplicative factor from AdaSSP bounds. The derivation must explicitly state all assumptions beyond bounded (X, Y), show how the guarantees circumvent sketching limitations, and demonstrate that no compensating dimension-dependent factors are introduced when plugged into the IHM construction.
[Theorem stating IHM utility bound] Excess empirical risk bound for IHM (the theorem stating the improved bound): the proof must contain an explicit side-by-side comparison with the AdaSSP bound to confirm removal of the sqrt(d) factor, and must verify that the improvement holds under only the stated bounded-data assumption without hidden dependencies on d or other parameters.

minor comments (1)

[Algorithm 1 and surrounding text] Notation for sketching matrices and Hessian mixing steps should be introduced with a single consolidated table or diagram to improve readability across the algorithmic description and analysis sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The positive assessment of the work's significance is appreciated. We respond to each major comment below and will make the requested clarifications in the revised manuscript.

read point-by-point responses

Referee: [Section deriving new accuracy guarantees for Gaussian sketching] The new accuracy guarantees for prior Gaussian sketching approaches (abstract and the section deriving them): these are load-bearing for the headline claim that IHM removes the sqrt(d) multiplicative factor from AdaSSP bounds. The derivation must explicitly state all assumptions beyond bounded (X, Y), show how the guarantees circumvent sketching limitations, and demonstrate that no compensating dimension-dependent factors are introduced when plugged into the IHM construction.

Authors: The section already derives the new guarantees under the bounded (X, Y) assumption and explains the circumvention mechanism via iterative Hessian mixing. To ensure full transparency as requested, we will revise the manuscript to add an explicit paragraph that lists all assumptions, details how the iterative approach avoids the sketching limitations of prior methods, and confirms via the bound expressions that no additional dimension-dependent factors arise when the guarantees are used in the IHM construction. revision: yes
Referee: [Theorem stating IHM utility bound] Excess empirical risk bound for IHM (the theorem stating the improved bound): the proof must contain an explicit side-by-side comparison with the AdaSSP bound to confirm removal of the sqrt(d) factor, and must verify that the improvement holds under only the stated bounded-data assumption without hidden dependencies on d or other parameters.

Authors: The IHM bound is obtained directly from the new sketching guarantees under the bounded-data assumption, which removes the sqrt(d) factor present in the AdaSSP analysis. We will revise the proof to include an explicit side-by-side comparison of the two bounds (via displayed equations) and add a remark confirming that the improvement holds solely under the stated bounded (X, Y) assumption with no hidden dependencies on d or other parameters. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper presents new accuracy guarantees for Gaussian sketching approaches to DP-OLS and constructs IHM to achieve improved excess risk bounds over AdaSSP. These guarantees and the IHM algorithm are derived from first principles in the manuscript rather than reducing to self-citations, fitted parameters renamed as predictions, or self-definitional steps. Cited prior works (Sheffet et al., Wang et al., Pilanci et al.) provide independent context but are not load-bearing for the novel analysis or bounds. No equations or claims in the provided text exhibit the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the bounded-data assumption and sketching accuracy guarantees are implicit but not quantified here.

pith-pipeline@v0.9.0 · 5799 in / 1045 out tokens · 43678 ms · 2026-05-25T07:28:31.699533+00:00 · methodology

Near-Optimal Private Linear Regression via Iterative Hessian Mixing

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)