State-Evolution-based Score Matching for Generalized Approximate Message Passing
Pith reviewed 2026-06-30 02:48 UTC · model grok-4.3
The pith
A neural network trained by state-evolution score matching can replace the exact MMSE denoiser in Bayes-GAMP for complex nonlinear models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under ideal training conditions, generalized approximate message passing that substitutes a neural network trained via state-evolution-based score matching for the analytically intractable MMSE denoiser asymptotically attains the same performance as Bayes-GAMP that uses the exact conditional-expectation denoiser.
What carries the argument
State-evolution-based score matching, which produces training samples whose distribution matches the asymptotic messages seen by the denoiser and thereby lets a network learn to emulate the required conditional expectation without an explicit formula.
If this is right
- Bayes-GAMP becomes usable for virtually arbitrary nonlinear observation mappings in complex-valued settings.
- Only forward evaluations of the nonlinear mapping are required during offline training; no closed-form denoiser is needed.
- When training conditions are ideal, the resulting algorithm matches the asymptotic performance of exact Bayes-GAMP.
- The same training procedure applies equally to real- and complex-valued generalized linear models.
Where Pith is reading between the lines
- The same state-evolution training idea could be used to approximate intractable steps in other message-passing algorithms.
- Finite-size and finite-iteration effects would need separate verification before claiming practical optimality.
- Because the method is agnostic to the explicit form of the nonlinearity, it may also simplify analysis of related expectation-propagation variants.
Load-bearing premise
State evolution must give a sufficiently accurate description of the message distributions so that the synthetic training data matches the statistics the denoiser will actually encounter during runtime.
What would settle it
Apply the trained network inside GAMP to a specific complex GLM whose exact Bayes-optimal mean-square error is known in closed form and test whether the empirical error converges to that value as problem dimension grows to infinity.
Figures
read the original abstract
Generalized approximate message passing (GAMP) equipped with minimum mean-square error (MMSE) denoisers, commonly referred to as Bayes-GAMP, is a powerful framework for solving inverse problems described by generalized linear models (GLMs) with arbitrary component-wise nonlinearities in the observation process. However, despite its theoretical tractability and rigorously established asymptotic optimality, the range of practical observation models for which Bayes-GAMP admits a closed-form implementation remains severely limited, particularly in complex-valued settings. This limitation largely stems from the restrictive requirement that the corresponding output denoiser, given by a conditional expectation, admit a closed-form expression. To overcome this limitation, we propose a principled approach that enables the implementation of Bayes-GAMP for complex-valued models with \emph{virtually arbitrary} nonlinear observation mappings. Specifically, within a score-matching framework, we train a neural network to emulate the output denoiser using training data generated from a characterization of the message dynamics based on state evolution (SE). Notably, the proposed approach requires neither explicit evaluation of the denoiser nor knowledge of an explicit functional form of the nonlinear mapping; it requires only access to forward evaluations of the mapping during offline training. We show that, under ideal training conditions, GAMP with the trained network replacing the analytically intractable denoiser asymptotically matches the performance of Bayes-GAMP with the exact denoiser.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes training a neural network via score matching to approximate the MMSE output denoiser in generalized approximate message passing (GAMP) for complex-valued generalized linear models with arbitrary nonlinearities. Training data is synthesized from state-evolution characterizations of the message dynamics, requiring only forward evaluations of the observation mapping. The central claim is that, under ideal training conditions, the resulting algorithm asymptotically matches the performance of Bayes-GAMP using the exact denoiser.
Significance. If the central claim holds, the method would substantially extend the applicability of asymptotically optimal Bayes-GAMP to observation models where closed-form conditional expectations are unavailable, which is common in complex-valued settings. The approach of using SE to generate training data without needing the closed-form denoiser is a notable technical strength.
major comments (3)
- [Abstract] Abstract: the performance claim is stated only under 'ideal training conditions' without a formal definition, verification procedure, or bound quantifying how closely practical training approximates these conditions; this assumption is load-bearing for the equivalence result.
- [Training data generation] The training-data generation procedure (described after the abstract): SE equations are derived under the assumption of the exact MMSE denoiser, yet no fixed-point consistency argument or perturbation analysis is supplied to show that the approximation error of the learned network leaves the effective observation statistics (and thus the SE parameters) invariant at deployment.
- [Main derivation] The manuscript provides no error bounds, convergence rates, or sensitivity analysis on how score-matching approximation error propagates through the GAMP iterations; this is required to substantiate the asymptotic matching claim beyond the ideal case.
minor comments (1)
- Notation for the complex-valued case and the score function could be clarified with an explicit comparison table to the real-valued setting.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the scope and limitations of our claims. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the performance claim is stated only under 'ideal training conditions' without a formal definition, verification procedure, or bound quantifying how closely practical training approximates these conditions; this assumption is load-bearing for the equivalence result.
Authors: We agree that the term 'ideal training conditions' requires a formal definition. In the revised manuscript we will explicitly define it as the regime in which the neural network achieves zero score-matching loss and thereby exactly recovers the MMSE denoiser (i.e., the network output coincides with the conditional expectation almost everywhere with respect to the state-evolution measure). We will also add a short verification procedure that compares the trained network against the exact denoiser on synthetic instances where the latter is available, and note that the attained training loss serves as a practical proxy for proximity to the ideal regime. Quantitative bounds on the deviation from ideal conditions are not derived, as this would require a separate approximation-theory analysis. revision: yes
-
Referee: [Training data generation] The training-data generation procedure (described after the abstract): SE equations are derived under the assumption of the exact MMSE denoiser, yet no fixed-point consistency argument or perturbation analysis is supplied to show that the approximation error of the learned network leaves the effective observation statistics (and thus the SE parameters) invariant at deployment.
Authors: Under the ideal-training definition we will introduce, the learned network coincides with the exact MMSE denoiser, so the state-evolution parameters used for training remain exactly consistent at deployment by construction. The manuscript's asymptotic claim is stated only for this ideal case; we do not assert invariance for arbitrary approximation error. In revision we will insert a clarifying remark after the training-data description that makes this distinction explicit and notes that continuity of the SE map with respect to the denoiser (in an appropriate topology) suggests that small deviations produce correspondingly small changes in the effective statistics, although a rigorous perturbation analysis is omitted. revision: partial
-
Referee: [Main derivation] The manuscript provides no error bounds, convergence rates, or sensitivity analysis on how score-matching approximation error propagates through the GAMP iterations; this is required to substantiate the asymptotic matching claim beyond the ideal case.
Authors: The stated asymptotic equivalence holds exclusively under ideal training conditions, where the approximation error is identically zero; consequently no propagation analysis is required for the theorem as written. The manuscript does not claim asymptotic optimality once the network deviates from the exact denoiser. Experiments illustrate that sufficiently accurate approximations yield performance close to Bayes-GAMP, but we make no theoretical claim beyond the ideal regime. We will revise the abstract, introduction, and conclusion to emphasize this scope limitation more explicitly. revision: yes
Circularity Check
No circularity: SE used as external tool to generate training data for ideal-case emulation
full rationale
The derivation generates training pairs from state-evolution characterizations of message statistics and trains a network via score matching to emulate the MMSE denoiser. The central claim is explicitly conditioned on 'ideal training conditions' under which the network exactly reproduces the denoiser, so that the GAMP trajectory and its SE remain unchanged by construction. No equation or step defines a quantity in terms of itself, renames a fitted parameter as a prediction, or reduces the result to a self-citation chain. State evolution is invoked as an independent, externally derived characterization rather than as a fitted input whose validity depends on the learned network. The paper is therefore self-contained against its stated assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption State evolution provides an accurate large-system characterization of the message-passing dynamics that can be used to generate training data for the denoiser network.
Reference graph
Works this paper leans on
-
[1]
Pace and A
L. Pace and A. Salvan, Principles Of Statistical Inference From A Neo- fisherian Perspective , ser. Advanced Series On Statistical Science And Applied Probability. World Scientific Publishing Company, 1997
1997
-
[2]
Reverend Bayes on inference engines: a distri buted hierarchical approach,
J. Pearl, “Reverend Bayes on inference engines: a distri buted hierarchical approach,” in Proc. 2nd AAAI Conf. Artif. Intell. , ser. AAAI’82. AAAI Press, 1982, p. 133–136
1982
-
[3]
Factor gra phs and the sum-product algorithm,
F. Kschischang, B. Frey, and H.-A. Loeliger, “Factor gra phs and the sum-product algorithm,” IEEE Trans. Inf. Theory , vol. 47, no. 2, pp. 498–519, 2001
2001
-
[4]
Generalized approximate message passing fo r estimation with random linear mixing,
S. Rangan, “Generalized approximate message passing fo r estimation with random linear mixing,” in Proc. 2011 IEEE Int. Symp. Inf. Theory (ISIT), July 2011, pp. 2168–2172
2011
-
[5]
Generalized Approximate Message Passing for Estimation with Random Linear Mixing
S. Rangan, “Generalized approximate message passing fo r estimation with random linear mixing,” 2012. [Online]. Ava ilable: https://arxiv.org/abs/1010.5141
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[6]
Message-passi ng de- quantization with applications to compressed sensing,
U. S. Kamilov, V . K. Goyal, and S. Rangan, “Message-passi ng de- quantization with applications to compressed sensing,” IEEE Trans. Signal Process. , vol. 60, no. 12, pp. 6270–6281, 2012
2012
-
[7]
Bayes-optim al joint channel-and-data estimation for massive MIMO with lo w-precision ADCs,
C. Wen, C. Wang, S. Jin, K. Wong, and P . Ting, “Bayes-optim al joint channel-and-data estimation for massive MIMO with lo w-precision ADCs,” IEEE Trans. Signal Process. , vol. 64, no. 10, pp. 2541–2556, 2016
2016
-
[8]
Bayes - optimal MMSE detector for massive MIMO relaying with low-pr ecision ADCs/DACs,
X. Y ang, C.-K. Wen, S. Jin, and A. L. Swindlehurst, “Bayes - optimal MMSE detector for massive MIMO relaying with low-pr ecision ADCs/DACs,” IEEE Trans. Signal Process. , vol. 68, pp. 3341–3357, 2020
2020
-
[9]
Generalized approximate message-passin g for compressed sensing with sublinear sparsity,
K. Takeuchi, “Generalized approximate message-passin g for compressed sensing with sublinear sparsity,” IEEE Trans. Inf. Theory , vol. 71, no. 6, pp. 4602–4636, 2025
2025
-
[10]
Belief propagation receive rs for near- optimal detection of nonlinearly distorted OFDM signals,
S. V . Zhidkov and R. Dinis, “Belief propagation receive rs for near- optimal detection of nonlinearly distorted OFDM signals,” in Proc. IEEE V eh. Technol. Conf. (VTC-Spring), 2019, pp. 1–6
2019
-
[11]
Fast GAMP algorit hm for nonlinearly distorted OFDM signals,
C. Y ang, X. Liu, Y . L. Guan, and R. Liu, “Fast GAMP algorit hm for nonlinearly distorted OFDM signals,” IEEE Commun. Lett. , vol. 25, no. 5, pp. 1682–1686, 2021
2021
-
[12]
GAMP or GOAMP/GV AMP receiver in generalized linear system s: Achievable rate, coding principle, and comparative study,
Y . Chi, X. Chen, L. Liu, Y . Li, B. Bai, A. Al Hammadi, and C. Y uen, “GAMP or GOAMP/GV AMP receiver in generalized linear system s: Achievable rate, coding principle, and comparative study, ” IEEE Trans. Commun., vol. 72, no. 10, pp. 6237–6253, 2024. 11
2024
-
[13]
Compressive phase retrieva l via generalized approximate message passing,
P . Schniter and S. Rangan, “Compressive phase retrieva l via generalized approximate message passing,” IEEE Trans. Signal Processing , vol. 63, no. 4, pp. 1043–1055, 2015
2015
-
[14]
Phase retrieval from quantized measurements via approximate message passing,
J. Zhu, Q. Y uan, C. Song, and Z. Xu, “Phase retrieval from quantized measurements via approximate message passing,” IEEE Signal Process. Lett., vol. 26, no. 7, pp. 986–990, 2019
2019
-
[15]
Optimization-based AMP for phase retrieval: The impact of initialization and ℓ2 regularization,
J. Ma, J. Xu, and A. Maleki, “Optimization-based AMP for phase retrieval: The impact of initialization and ℓ2 regularization,” IEEE Trans. Inf. Theory , vol. 65, no. 6, pp. 3600–3629, 2019
2019
-
[16]
Estimation of non-normalized statist ical models by score matching,
A. Hyv¨ arinen, “Estimation of non-normalized statist ical models by score matching,” J. Mach. Learn. Res. , vol. 6, no. 24, pp. 695–709, 2005
2005
-
[17]
A connection between score matching and de noising au- toencoders,
P . Vincent, “A connection between score matching and de noising au- toencoders,” Neural Comput. , vol. 23, no. 7, pp. 1661–1674, 2011
2011
-
[18]
Message-pas sing al- gorithms for compressed sensing,
D. L. Donoho, A. Maleki, and A. Montanari, “Message-pas sing al- gorithms for compressed sensing,” Proc. of the National Academy of Sciences, vol. 106, no. 45, pp. 18 914–18 919, 2009
2009
-
[19]
The dynamics of message pas sing on dense graphs, with applications to compressed sensing,
M. Bayati and A. Montanari, “The dynamics of message pas sing on dense graphs, with applications to compressed sensing,” IEEE Trans. Inf. Theory , vol. 57, no. 2, pp. 764–785, 2011
2011
-
[20]
State evolution for gen eral approxi- mate message passing algorithms, with applications to spat ial coupling,
A. Javanmard and A. Montanari, “State evolution for gen eral approxi- mate message passing algorithms, with applications to spat ial coupling,” Inf. Inference: A Journal of the IMA , vol. 2, no. 2, pp. 115–144, 2013
2013
-
[21]
A unifying tutorial on approximate message passing,
O. Y . Feng, R. V enkataramanan, C. Rush, and R. J. Samwort h, “A unifying tutorial on approximate message passing,” F ound. Trends Mach. Learn., vol. 15, no. 4, pp. 335–536, 2022
2022
-
[22]
V ector appr oximate message passing,
S. Rangan, P . Schniter, and A. K. Fletcher, “V ector appr oximate message passing,” IEEE Trans. Inf. Theory , vol. 65, no. 10, pp. 6664–6684, 2019
2019
-
[23]
Generative modeling by estimatin g gradients of the data distribution,
Y . Song and S. Ermon, “Generative modeling by estimatin g gradients of the data distribution,” in Adv. Neural Inf. Process. Syst. (NeurIPS) , vol. 32, 2019, p. 11895–11907
2019
-
[24]
Score-based V AMP with Fi sher- information-based Onsager correction,
T. Wadayama and T. Takahashi, “Score-based V AMP with Fi sher- information-based Onsager correction,” 2026. [Online]. A vailable: https://arxiv.org/abs/2601.07095
-
[25]
Three-module SC-V AMP for LDPC-coded nonlinear ch annels,
——, “Three-module SC-V AMP for LDPC-coded nonlinear ch annels,”
-
[26]
Available: https://arxiv.org/abs/2604
[Online]. Available: https://arxiv.org/abs/2604. 19061
-
[27]
V ector appr oximate message passing for the generalized linear model,
P . Schniter, S. Rangan, and A. K. Fletcher, “V ector appr oximate message passing for the generalized linear model,” in Proc. Conf. Rec. Asilomar Conf. Signals Syst. , 2016, pp. 1525–1529
2016
-
[28]
Bayes-optimal unsupervised learning for channel estimat ion in near- field holographic MIMO,
W. Y u, H. He, X. Y u, S. Song, J. Zhang, R. Murch, and K. B. Le taief, “Bayes-optimal unsupervised learning for channel estimat ion in near- field holographic MIMO,” IEEE J. Sel. Top. Signal Process. , vol. 18, no. 4, pp. 714–729, 2024
2024
-
[29]
Score-based turbo m essage passing for plug-and-play compressive image recovery,
C. Cai, X. Y uan, and Y .-J. A. Zhang, “Score-based turbo m essage passing for plug-and-play compressive image recovery,” in Proc. IEEE Int. W orkshop Signal Process. Artif. Intell. Wireless Comm un. (SPAWC), 2025, pp. 1–5
2025
-
[30]
Orthogonal AMP,
J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access , vol. 5, pp. 2020– 2033, 2017
2020
-
[31]
Rigorous dynamics of expectation-propa gation-based sig- nal recovery from unitarily invariant measurements,
K. Takeuchi, “Rigorous dynamics of expectation-propa gation-based sig- nal recovery from unitarily invariant measurements,” IEEE Trans. Inf. Theory, vol. 66, no. 1, pp. 368–386, 2020
2020
-
[32]
Concise de rivation for generalized approximate message passing using expecta tion prop- agation,
Q. Zou, H. Zhang, C.-K. Wen, S. Jin, and R. Y u, “Concise de rivation for generalized approximate message passing using expecta tion prop- agation,” IEEE Signal Process. Lett. , vol. 25, no. 12, pp. 1835–1839, 2018
2018
-
[33]
A unified Bayesian inference f ramework for generalized linear models,
X. Meng, S. Wu, and J. Zhu, “A unified Bayesian inference f ramework for generalized linear models,” IEEE Signal Process. Lett. , vol. 25, no. 3, pp. 398–402, 2018
2018
-
[34]
A new insi ght into GAMP and AMP,
L. Liu, Y . Li, C. Huang, C. Y uen, and Y . L. Guan, “A new insi ght into GAMP and AMP,” IEEE Trans. V eh. Technol., vol. 68, no. 8, pp. 8264–8269, 2019
2019
-
[35]
An empirical Bayes approach to statisti cs,
H. E. Robbins, “An empirical Bayes approach to statisti cs,” in Proc. Third Berkeley Symp. Math. Statist. Probab., V ol. 1, Berkeley, CA, USA, 1956, pp. 157–163
1956
-
[36]
Tweedie’s formula and selection bias,
B. Efron, “Tweedie’s formula and selection bias,” J. Amer . Statist. Assoc., vol. 106, no. 496, pp. 1602–1614, 2011
2011
-
[37]
Optimal errors and phase transitions in hig h- dimensional generalized linear models,
J. Barbier, F. Krzakala, N. Macris, L. Miolane, and L. Zdeborov´ a, “Optimal errors and phase transitions in hig h- dimensional generalized linear models,” Proc. Natl. Acad. Sci. , vol. 116, no. 12, pp. 5451–5460, 2019. [Online]. Available: https://www.pnas.org/doi/abs/10.1073/pnas.1802705116
-
[38]
Approximate messa ge passing with spectral initialization for generalized linear model s,
M. Mondelli and R. V enkataramanan, “Approximate messa ge passing with spectral initialization for generalized linear model s,” J. Stat. Mech.: Theory Exp. , vol. 2022, no. 11, p. 114003, Nov. 2022
2022
-
[39]
S. M. Kay, Fundamentals of Statistical Signal Processing: Estimatio n Theory. Englewood Cliffs, NJ: Prentice-Hall PTR, 1993, vol. 1
1993
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.