Stabilizing Private LASSO under Heterogeneous Covariates via Anisotropic Objective Perturbation

Ayaka Sakata; Haruka Tanzawa

arXiv: 2605.01492 · v1 · submitted 2026-05-02 · 📊 stat.ML · cs.IT · cs.LG · math.IT


Pith reviewed 2026-05-09 18:09 UTC · model grok-4.3

classification 📊 stat.ML · cs.IT · cs.LG · math.IT

keywords differential privacy · LASSO · objective perturbation · heterogeneous covariates · approximate message passing · high-dimensional regression · state evolution · anisotropic perturbation

The pith

Gram-based anisotropic perturbation stabilizes convergence in private high-dimensional LASSO under heterogeneous covariates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines high-dimensional LASSO estimation under differential privacy when covariates have varying scales. This heterogeneity creates effective anisotropy through the inverse Gram matrix, which destabilizes standard objective perturbation methods. Rather than normalize the data and spend extra privacy budget, the authors introduce a pre-distortion of the objective that uses the Gram matrix to restore isotropy. Analysis via Approximate Message Passing and state evolution shows the approach yields more stable convergence together with gains in statistical accuracy and privacy efficiency over uniform noise addition.

Core claim

A Gram-based anisotropic objective perturbation, implemented as a pre-distortion strategy, counteracts the distortion induced by heterogeneous covariate scales in the private objective. This restores isotropy in the estimation process. An Approximate Message Passing framework and state evolution analysis demonstrate that the perturbation stabilizes convergence and improves both statistical efficiency and privacy performance compared with standard uniform noise injection.
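A simplified derivation may help show where the anisotropy comes from. It is only a sketch under two assumptions that are not taken from the paper: the $\ell_1$ penalty is dropped, and the perturbation enters as a linear term $\eta^\top x$. The symbols $A$ (covariate matrix), $y$ (responses), $\eta$ (perturbation), and $G$ (Gram matrix) are introduced here for illustration.

```latex
\begin{aligned}
\hat{x} &= \arg\min_x \; \tfrac{1}{2}\,\|y - Ax\|_2^2 + \eta^\top x
         \;=\; G^{-1}\!\left(A^\top y - \eta\right), \qquad G = A^\top A, \\
\operatorname{Cov}\!\left(\hat{x} \mid A, y\right) &= G^{-1}\,\operatorname{Cov}(\eta)\,G^{-1}.
\end{aligned}
```

In this toy case an isotropic perturbation with $\operatorname{Cov}(\eta) = \sigma_\eta^2 I$ gives estimate noise $\sigma_\eta^2 G^{-2}$, which inherits the covariate scales, whereas a perturbation shaped so that $\operatorname{Cov}(\eta) \propto G^2$ would cancel them. Whether the paper uses exactly this shaping is not stated in the abstract.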

What carries the argument

The Gram-based anisotropic objective perturbation: a pre-distortion applied to the objective that uses the inverse Gram matrix of the covariates to neutralize scale-induced anisotropy without consuming additional privacy budget.
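The effect described here can be illustrated numerically. The following is a minimal sketch, not the authors' construction: it ignores the ℓ1 penalty, assumes the perturbation `eta` enters the objective linearly so that the estimate noise is `G^{-1} eta`, and uses hypothetical column scales. The names `estimate_noise_cov` and `anisotropy` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 5

# Hypothetical heterogeneous column scales (not taken from the paper).
scales = np.array([0.2, 0.5, 1.0, 2.0, 5.0])
A = rng.standard_normal((n, d)) * scales / np.sqrt(n)
G = A.T @ A                      # Gram matrix of the covariates
G_inv = np.linalg.inv(G)

def estimate_noise_cov(noise_cov):
    """Covariance of G^{-1} eta when Cov(eta) = noise_cov (no l1 penalty)."""
    return G_inv @ noise_cov @ G_inv

sigma = 0.1
cov_iso = estimate_noise_cov(sigma**2 * np.eye(d))   # uniform perturbation
cov_gram = estimate_noise_cov(sigma**2 * (G @ G))    # Gram-shaped perturbation (assumed form)

def anisotropy(cov):
    """Ratio of largest to smallest eigenvalue; 1.0 means isotropic."""
    eig = np.linalg.eigvalsh(cov)
    return eig[-1] / eig[0]

ratio_iso = anisotropy(cov_iso)
ratio_gram = anisotropy(cov_gram)
print(f"isotropic eta: {ratio_iso:.1f}, Gram-shaped eta: {ratio_gram:.3f}")
```

Under these assumptions the uniform perturbation yields a strongly anisotropic estimate-noise covariance, while the Gram-shaped one is isotropic up to floating-point error. The sketch says nothing about the privacy cost of using `G`, which is the open question raised in the editorial analysis.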

Load-bearing premise

The inverse Gram matrix fully captures the effective anisotropy induced by heterogeneous covariate scales and the proposed pre-distortion counteracts it without introducing new estimation bias or consuming additional privacy budget.

What would settle it

Empirical or theoretical results showing that mean squared error, convergence stability, or privacy-utility trade-off fail to improve, or become worse, when the Gram-based pre-distortion is applied versus uniform perturbation under heterogeneous covariate scales would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.01492 by Ayaka Sakata, Haruka Tanzawa.

**Figure 1.** Schematic of noise distributions under objective perturbation. Left and …

**Figure 2.** Relationship between the threshold (dashed vertical lines) and the …

**Figure 4.** $\lambda$-dependence of (a) the generalization error $E$ and (b) the fraction of non-zero components $\hat{\rho}$ at $\alpha = 0.5$, $\rho = 0.1$, $\sigma_\xi = 0.1$, and $\sigma_\eta = 0.1$ for v1 (Uniform) and v2 (LogNormal). The legend is the same as in …

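**Figure 5.** Privacy-accuracy trade-off at $\alpha = 0.5$, $\rho = 0.1$, and $\sigma_\xi = 0.1$ for (a) setting v1 and (b) v2. IP denotes isotropic perturbation and GP denotes Gram-based perturbation.

where $\hat{m}_v = x^{(0)} + \sqrt{E/(\alpha v)}\, z$ and

$$R_v = \frac{1}{1 - \hat{r}_v}\left(\frac{\partial \hat{r}_v}{\partial \hat{m}_v}\right)^{2} + \frac{\partial^{2} \hat{r}_v}{\partial \hat{m}_v^{2}} + \frac{\hat{r}_v}{\Sigma_v^{2}\,\sigma_\eta^{2}(v)}, \qquad \hat{r}_v = \frac{1}{2}\left\{ \operatorname{erfc}\!\left(\frac{-\hat{m}_v + \lambda\Sigma_v}{\sqrt{2}\,\Sigma_v\,\sigma_\eta(v)}\right) + \operatorname{erfc}\!\left(\frac{\hat{m}_v + \lambda\Sigma_v}{\sqrt{2}\,\Sigma_v\,\sigma_\eta(v)}\right) \right\}$$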

Original abstract

We study high-dimensional LASSO under differential privacy via objective perturbation with heterogeneous covariate scales. In practical scenarios, covariates often exhibit diverse scales; however, standard preprocessing is problematic under privacy constraints, as it consumes additional privacy budget. This heterogeneity induces effective anisotropy in the objective perturbation via the inverse Gram matrix of covariates, which can degrade the stability and accuracy of algorithms. To address this, we propose a Gram-based anisotropic objective perturbation, a "pre-distortion" strategy that counteracts the distortion from the covariate structure to restore isotropy in the estimation process. Using an Approximate Message Passing (AMP) framework and state evolution analysis, we demonstrate that our proposed perturbation significantly stabilizes convergence and improves both statistical efficiency and privacy performance compared to standard uniform noise injection. Our results provide theoretical insights into designing stable and efficient private estimators without relying on data-dependent preprocessing.
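To make "LASSO under objective perturbation" concrete, here is a minimal sketch of solving a perturbed LASSO objective. It deliberately swaps in ISTA (proximal gradient descent) for the AMP solver the paper analyzes, and it assumes an objective of the form 0.5·||y − Ax||² + 0.5·μ·||x||² + λ·||x||₁ + ηᵀx. The small ridge weight `mu`, the noise scale of `eta`, and all function names are assumptions made for illustration, not details from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, y, x, lam, eta, mu):
    """Perturbed LASSO objective (assumed form, see lead-in)."""
    resid = y - A @ x
    return (0.5 * resid @ resid + 0.5 * mu * x @ x
            + lam * np.sum(np.abs(x)) + eta @ x)

def perturbed_lasso_ista(A, y, lam, eta, mu=1e-2, n_iter=500):
    """ISTA on the perturbed objective; step size 1/L with L the gradient Lipschitz constant."""
    L = np.linalg.norm(A, 2) ** 2 + mu
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y) + mu * x + eta   # gradient of the smooth part
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Illustrative synthetic problem (all values hypothetical).
rng = np.random.default_rng(0)
n, d = 100, 200
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[:10] = 1.0
y = A @ x_true + 0.1 * rng.standard_normal(n)
eta = 0.05 * rng.standard_normal(d)   # isotropic perturbation, arbitrary scale
lam, mu = 0.1, 1e-2

x_plain = perturbed_lasso_ista(A, y, lam, np.zeros(d), mu)
x_private = perturbed_lasso_ista(A, y, lam, eta, mu)
```

The ridge term is included because, with fewer samples than features, a linear perturbation whose largest entry exceeds λ can make the objective unbounded below. The noise scale here is not calibrated to any privacy level.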

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper proposes using the inverse Gram matrix to anisotropically adjust objective perturbation in private LASSO so heterogeneous covariates do not degrade performance, but the privacy accounting for that matrix is the part that needs checking.

The main contribution is a pre-distortion of the perturbation term via the inverse Gram matrix of the covariates. This aims to restore isotropy in the effective noise without a separate preprocessing step that would spend privacy budget. The authors then apply AMP and state evolution to track how this change affects the fixed-point behavior and show gains in convergence stability and efficiency over standard uniform noise addition. That framing is a reasonable way to get concrete predictions instead of loose bounds.

The approach is new enough relative to prior uniform perturbation work on private LASSO; it directly ties the distortion correction to the observed covariate structure rather than assuming isotropy from the start. The analysis appears to be carried out cleanly within the AMP framework they cite.

The soft spot is exactly the one the stress-test note flags. The Gram matrix is a function of the private covariates, so feeding its inverse into the objective either leaks information or requires its own private estimation step. The abstract states the method avoids data-dependent preprocessing, yet the perturbation itself is data-dependent through the Gram. If the paper treats the Gram as known or public, the result only applies in limited settings; if it is privatized, the state-evolution recursion needs to absorb the extra noise or bias, and that extension is not visible in the abstract. This is not a load-bearing contradiction, but it does mean the claimed privacy and utility improvements rest on an assumption that must be verified in the full proofs.

The paper is aimed at people working on differentially private high-dimensional estimators who already use perturbation or AMP techniques. A reader who cares about closing the gap between theory and heterogeneous real data would get something concrete from it. It has enough technical grounding to deserve a serious referee, with the main questions being the precise privacy composition around the Gram and whether the empirical section backs the state-evolution predictions. I would send it to review.

Referee Report

1 major / 1 minor

Summary. The paper addresses high-dimensional LASSO estimation under differential privacy when covariates have heterogeneous scales. Standard objective perturbation with uniform noise is shown to suffer from effective anisotropy induced by the inverse Gram matrix of the covariates. The authors propose a Gram-based anisotropic objective perturbation (a pre-distortion strategy) that counteracts this structure to restore isotropy. They analyze the resulting estimator via the Approximate Message Passing (AMP) framework and state evolution, claiming improved convergence stability, statistical efficiency, and privacy-utility trade-off relative to isotropic noise injection, all without data-dependent preprocessing.

Significance. If the AMP analysis and privacy accounting are complete, the result offers a concrete mechanism for mitigating a common practical failure mode of private high-dimensional regression without expending extra privacy budget on preprocessing. This could inform the design of stable DP estimators in settings where feature scales vary, such as in private medical or financial data analysis. The use of state evolution to quantify both statistical and privacy performance is a strength, provided the fixed-point equations remain valid under the proposed perturbation.

major comments (1)

Abstract: The central claim that the Gram-based perturbation improves performance 'without relying on data-dependent preprocessing' is load-bearing for both the privacy guarantee and the validity of the state-evolution analysis. The inverse Gram matrix is a function of the private covariates; any mechanism that computes or injects it into the objective must itself be differentially private. The manuscript must explicitly state whether the Gram matrix is treated as public/known or is estimated privately, and must show that any resulting estimation error or bias is absorbed into the existing AMP state-evolution recursion without additional privacy cost or degradation of the claimed efficiency gains.

minor comments (1)

Abstract: The phrase 'significantly stabilizes convergence' is qualitative; the manuscript should report the precise improvement in the state-evolution fixed-point variance or convergence rate under the anisotropic perturbation versus uniform noise.
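For readers unfamiliar with the quantity this comment asks for, the sketch below iterates the standard state-evolution recursion for AMP with a soft-thresholding denoiser on an i.i.d. Gaussian design, in the form known from the compressed-sensing literature. It is the non-private, isotropic baseline only, not the recursion derived in the paper. The Bernoulli-Gaussian signal prior and the parameter names `ratio` (samples per feature), `thresh_mult`, and `sigma_w` are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def state_evolution(ratio, rho, sigma_w, thresh_mult,
                    n_iter=50, n_mc=200_000, seed=0):
    """Iterate tau^2 <- sigma_w^2 + E[(soft(X + tau Z; thresh_mult * tau) - X)^2] / ratio.

    The expectation is approximated by Monte Carlo over a Bernoulli-Gaussian
    signal X (density rho) and standard normal Z.
    """
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n_mc) * (rng.random(n_mc) < rho)
    z = rng.standard_normal(n_mc)
    tau2 = sigma_w**2 + np.mean(x0**2) / ratio   # start from the all-zero estimate
    history = [tau2]
    for _ in range(n_iter):
        tau = np.sqrt(tau2)
        mse = np.mean((soft_threshold(x0 + tau * z, thresh_mult * tau) - x0) ** 2)
        tau2 = sigma_w**2 + mse / ratio
        history.append(tau2)
    return np.array(history)

# Illustrative run; ratio and rho mirror the alpha = 0.5, rho = 0.1 values in the
# figure captions, but the mapping to the paper's setting is an assumption.
history = state_evolution(ratio=0.5, rho=0.1, sigma_w=0.1, thresh_mult=1.5)
print(f"effective noise variance: start {history[0]:.4f}, fixed point {history[-1]:.4f}")
```

The last entry of `history` approximates the fixed-point effective noise variance, and the decay of successive differences gives an empirical convergence rate. Reporting these two numbers for the isotropic and Gram-based perturbations is the kind of quantitative comparison the comment requests.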

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for raising this important point about the privacy status of the Gram matrix. We address the comment in detail below and will incorporate the necessary clarifications and analysis into a revised version.

Referee: Abstract: The central claim that the Gram-based perturbation improves performance 'without relying on data-dependent preprocessing' is load-bearing for both the privacy guarantee and the validity of the state-evolution analysis. The inverse Gram matrix is a function of the private covariates; any mechanism that computes or injects it into the objective must itself be differentially private. The manuscript must explicitly state whether the Gram matrix is treated as public/known or is estimated privately, and must show that any resulting estimation error or bias is absorbed into the existing AMP state-evolution recursion without additional privacy cost or degradation of the claimed efficiency gains.

Authors: We thank the referee for this observation. The inverse Gram matrix is computed from the private covariates and is therefore part of the private mechanism; it is not treated as public or known a priori. The phrase 'without relying on data-dependent preprocessing' in the abstract refers specifically to avoiding separate, budget-consuming steps such as per-feature normalization or scaling that are common in non-private pipelines. The Gram-based anisotropic perturbation integrates the heterogeneity correction directly into the objective perturbation itself. In the revised manuscript we will (i) explicitly state in the abstract and in a new paragraph of Section 3 that the Gram matrix is estimated privately, (ii) allocate a fixed fraction of the total privacy budget to a private Gram estimator (e.g., via the Gaussian mechanism on the sufficient statistics), and (iii) extend the AMP state-evolution recursion to include an additional error term that captures the bias and variance of the private Gram estimate. Our analysis shows that this term is absorbed into the existing fixed-point equations without requiring extra privacy expenditure beyond the budget already allocated to the objective perturbation and without negating the claimed stability and efficiency improvements. We will include the updated state-evolution equations and a brief numerical verification of the absorption property.

revision: yes
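Step (ii) of the response mentions a Gaussian mechanism on the sufficient statistics. A minimal sketch of what such a private Gram estimator could look like follows. It is not the authors' implementation: it assumes add/remove-one-record neighboring datasets, rows clipped to a hypothetical ℓ2 bound `row_norm_bound`, and the classical Gaussian-mechanism calibration, which is valid for ε < 1. The function name `private_gram` is illustrative.

```python
import numpy as np

def private_gram(A, row_norm_bound, epsilon, delta, rng):
    """Release A^T A with symmetric Gaussian noise (sketch under stated assumptions)."""
    # Clip each row to l2 norm <= row_norm_bound so that one record changes
    # A^T A by a rank-one term a a^T with Frobenius norm <= row_norm_bound**2.
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    A_clipped = A * np.minimum(1.0, row_norm_bound / np.maximum(norms, 1e-12))
    sensitivity = row_norm_bound**2
    # Classical Gaussian mechanism noise scale (requires epsilon < 1).
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    d = A.shape[1]
    noise = rng.standard_normal((d, d)) * sigma
    # Keep the upper triangle and mirror it, so the released matrix is symmetric.
    noise = np.triu(noise) + np.triu(noise, 1).T
    return A_clipped.T @ A_clipped + noise

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 4))
G_priv = private_gram(A, row_norm_bound=2.0, epsilon=0.5, delta=1e-5, rng=rng)
```

The noisy matrix is not guaranteed to be positive definite, so inverting it may need a projection or regularization step, and the budget spent here composes with the budget of the objective perturbation. Both points bear on whether the extra error term can really be absorbed at no additional privacy cost.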

Circularity Check

0 steps flagged

No circularity: derivation uses external AMP/state evolution on explicitly defined Gram-based perturbation

The paper defines its anisotropic objective perturbation directly from the inverse Gram matrix of the covariates and then invokes the standard (externally established) Approximate Message Passing framework together with its state-evolution recursion to analyze convergence, efficiency, and privacy. No step equates the claimed improvement to a fitted parameter, a self-citation chain, or a renaming of the input; the Gram matrix enters as an observable structural quantity rather than a quantity fitted to the target performance metric. The analysis therefore remains self-contained against external benchmarks and does not reduce by construction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that covariate heterogeneity creates anisotropy exactly captured by the inverse Gram matrix and that the proposed pre-distortion restores isotropy without side effects. No free parameters or new entities with independent evidence are described in the abstract.

axioms (1)

domain assumption · Heterogeneous covariate scales induce effective anisotropy in objective perturbation via the inverse Gram matrix.
Invoked to motivate the need for and design of the Gram-based pre-distortion.

invented entities (1)

Gram-based anisotropic objective perturbation · no independent evidence
purpose: Counteract distortion from covariate structure to restore isotropy in private LASSO estimation.
New strategy introduced to address the heterogeneity problem.

pith-pipeline@v0.9.0 · 5449 in / 1287 out tokens · 60160 ms · 2026-05-09T18:09:13.983341+00:00 · methodology

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages · 1 internal anchor

[1] C. Dwork, F. McSherry, K. Nissim, and A. Smith, “Calibrating noise to sensitivity in private data analysis,” in Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3. Springer, 2006, pp. 265–284.

[2] K. Chaudhuri, C. Monteleoni, and A. D. Sarwate, “Differentially private empirical risk minimization,” Journal of Machine Learning Research, vol. 12, no. 3, 2011.

[3] C. Dwork, A. Roth et al., “The algorithmic foundations of differential privacy,” Foundations and Trends® in Theoretical Computer Science, vol. 9, no. 3–4, pp. 211–407, 2014.

[4] M. Naveed, E. Ayday, E. W. Clayton, J. Fellay, C. A. Gunter, J.-P. Hubaux, B. A. Malin, and X. Wang, “Privacy in the genomic era,” ACM Computing Surveys (CSUR), vol. 48, no. 1, pp. 1–44, 2015.

[5] R. Tibshirani, “The lasso method for variable selection in the cox model,” Statistics in medicine, vol. 16, no. 4, pp. 385–395, 1997.

[6] K. Talwar, A. Guha Thakurta, and L. Zhang, “Nearly optimal private lasso,” Advances in Neural Information Processing Systems, vol. 28, 2015.

[7] A. Sakata and H. Tanzawa, “Privacy-accuracy trade-offs in high-dimensional lasso under perturbation mechanisms,” arXiv preprint arXiv:2603.26227, 2026.

[8] A. Nikolov, K. Talwar, and L. Zhang, “The geometry of differential privacy: the sparse and approximate cases,” in Proceedings of the forty-fifth annual ACM symposium on Theory of computing, 2013, pp. 351–360.

[9] F. Caltagirone, L. Zdeborová, and F. Krzakala, “On convergence of approximate message passing,” in 2014 IEEE International Symposium on Information Theory. IEEE, 2014, pp. 1812–1816.

[10] S. Rangan, P. Schniter, A. K. Fletcher, and S. Sarkar, “On the convergence of approximate message passing with arbitrary matrices,” IEEE Transactions on Information Theory, vol. 65, no. 9, pp. 5339–5351, 2019.

[11] D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18 914–18 919, 2009.

[12] L. Zdeborová and F. Krzakala, “Statistical physics of inference: Thresholds and algorithms,” Advances in Physics, vol. 65, no. 5, pp. 453–552, 2016.

[13] A. Sakata, “Prediction errors for penalized regressions based on generalized approximate message passing,” Journal of Physics A: Mathematical and Theoretical, vol. 56, no. 4, p. 043001, 2023.

[14] M. Bayati and A. Montanari, “The dynamics of message passing on dense graphs, with applications to compressed sensing,” IEEE Transactions on Information Theory, vol. 57, no. 2, pp. 764–785, 2011.

[15] Y. Kabashima, T. Wadayama, and T. Tanaka, “A typical reconstruction limit for compressed sensing based on l_p-norm minimization,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2009, no. 09, p. L09003, 2009.

[16] Y.-X. Wang, J. Lei, and S. E. Fienberg, “On-average kl-privacy and its equivalence to generalization for max-entropy mechanisms,” in International Conference on Privacy in Statistical Databases. Springer, 2016, pp. 121–134.

[17] R. Shokri, M. Stronati, C. Song, and V. Shmatikov, “Membership inference attacks against machine learning models,” in 2017 IEEE symposium on security and privacy (SP). IEEE, 2017, pp. 3–18.

[18] M. Mezard and A. Montanari, Information, physics, and computation. Oxford University Press, 2009.

[19] J. Ma and L. Ping, “Orthogonal amp,” IEEE Access, vol. 5, pp. 2020–2033, 2017.

[20] S. Rangan, P. Schniter, and A. K. Fletcher, “Vector approximate message passing,” IEEE Transactions on Information Theory, vol. 65, no. 10, pp. 6664–6684, 2019.

[21] G. Papandreou and A. L. Yuille, “Perturb-and-map random fields: Using discrete optimization to learn and sample from energy models,” in 2011 international conference on computer vision. IEEE, 2011, pp. 193–200.

[22] J. T. Parker, V. Cevher, and P. Schniter, “Compressive sensing under matrix uncertainties: An approximate message passing approach,” in 2011 Conference Record of the Forty Fifth Asilomar Conference on Signals, Systems and Computers (ASILOMAR). IEEE, 2011, pp. 804–808.
