Understanding Phase Transitions via Mutual Information and MMSE
Pith reviewed 2026-05-25 09:27 UTC · model grok-4.3
The pith
The replica prediction for optimal inference performance in the standard linear model is exact.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For the standard linear model, the replica formulas derived from statistical physics give the exact values of the mutual information and the minimum mean-square error achieved by optimal inference, including the precise locations of phase transitions in estimation quality.
What carries the argument
The standard linear model (observations equal a linear transform of the signal plus noise) together with its mutual-information and MMSE curves that mark the phase transitions.
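For concreteness, the model and the quantity the replica method predicts can be written as below. The notation is an assumption borrowed from common usage in the compressed-sensing literature (i.i.d. Gaussian matrix entries of variance 1/n, unit-variance noise, measurement rate delta); it is not stated on this page, and other normalizations are in use. Here I_X(s) and mmse_X(s) denote the mutual information and MMSE of the scalar Gaussian channel sqrt(s) X + N with X drawn from the signal prior.

```latex
% Standard linear model (assumed normalization)
Y = A X + W, \qquad
A \in \mathbb{R}^{m \times n},\quad
A_{ij} \overset{\text{iid}}{\sim} \mathcal{N}(0, 1/n), \qquad
W \sim \mathcal{N}(0, I_m), \qquad
m/n \to \delta .

% Replica-symmetric prediction for the normalized mutual information
\lim_{n \to \infty} \frac{1}{n}\, I(X; Y \mid A)
  = \min_{z \ge 0} \left\{
      I_X\!\left(\frac{\delta}{1+z}\right)
      + \frac{\delta}{2}\left[ \log(1+z) - \frac{z}{1+z} \right]
    \right\} .

% Stationary points of the objective satisfy the scalar fixed-point equation
z = \mathrm{mmse}_X\!\left(\frac{\delta}{1+z}\right) .
```

Under this reading, a phase transition corresponds to the global minimizer jumping from one stationary point to another as delta varies, and the predicted MMSE is the minimizing z wherever the minimizer is unique.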
If this is right
- The locations of phase transitions in estimation quality are determined by an explicit single-letter formula (a scalar optimization) rather than by simulating the high-dimensional system.
- Mutual information and MMSE in this model are linked by exact functional relationships that hold in the high-dimensional limit.
- Optimal inference performance can be predicted without constructing the estimator itself.
- The replica method supplies the correct asymptotic characterization for this canonical inference problem.
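One well-known relationship of this kind is the I-MMSE identity for the scalar Gaussian channel Y = sqrt(snr) X + N: measured in nats, the derivative of the mutual information with respect to snr equals half the MMSE. Whether this is the exact relationship the summary has in mind is an assumption. The sketch below checks the identity numerically for a standard Gaussian input, where both sides have closed forms; the choice of input and the function names are illustrative only.

```python
import math


def mutual_information_gaussian(snr):
    """I(X; sqrt(snr) X + N) in nats for X, N independent standard Gaussians."""
    return 0.5 * math.log1p(snr)


def mmse_gaussian(snr):
    """MMSE of estimating standard Gaussian X from sqrt(snr) X + N."""
    return 1.0 / (1.0 + snr)


def numerical_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    for snr in (0.1, 1.0, 10.0):
        lhs = numerical_derivative(mutual_information_gaussian, snr)
        rhs = 0.5 * mmse_gaussian(snr)
        print(f"snr={snr}: dI/dsnr={lhs:.8f}, mmse/2={rhs:.8f}")
```

The two printed columns should agree up to finite-difference error. For non-Gaussian inputs the same check would require numerical integration for both quantities.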
Where Pith is reading between the lines
- The same proof strategy may apply to other linear models with structured matrices or non-Gaussian noise.
- Design rules for measurement systems in compressed sensing could be derived directly from the replica thresholds.
- The exact formulas supply a benchmark against which practical algorithms can be compared in the large-system limit.
- Similar information-theoretic identities might characterize phase transitions in related high-dimensional estimation tasks.
Load-bearing premise
The standard linear model is both rich enough to be practically useful and simple enough to be studied rigorously.
What would settle it
A direct computation, for a concrete choice of signal prior and measurement matrix, showing that the finite-dimensional MMSE deviates from the replica formula by a fixed amount even as dimension tends to infinity.
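A minimal sketch of such a computation, for one case where everything is available in closed form: a standard Gaussian prior with an i.i.d. Gaussian matrix. The setup is an assumption chosen for tractability (Y = sqrt(snr) A X + W, entries of A with variance 1/n, m = delta * n rows), and a Gaussian prior shows no phase transition, so this only illustrates the kind of comparison described above. For this prior the fixed point M = 1 / (1 + delta * snr / (1 + snr * M)) reduces to a quadratic in M, and the finite-dimensional MMSE is the normalized trace of the posterior covariance.

```python
import numpy as np


def finite_mmse_gaussian(n, delta, snr, rng):
    """Per-coordinate MMSE for X ~ N(0, I_n) observed through
    Y = sqrt(snr) A X + W, with one random draw of A (assumed setup)."""
    m = int(round(delta * n))
    A = rng.standard_normal((m, n)) / np.sqrt(n)
    # Posterior covariance is (I + snr * A^T A)^{-1}; average its eigenvalues.
    eigenvalues = np.linalg.eigvalsh(A.T @ A)
    return float(np.mean(1.0 / (1.0 + snr * eigenvalues)))


def asymptotic_mmse_gaussian(delta, snr):
    """Positive root of snr*M^2 + (1 + delta*snr - snr)*M - 1 = 0,
    i.e. the fixed point M = 1 / (1 + delta*snr / (1 + snr*M))."""
    b = 1.0 + delta * snr - snr
    return (-b + np.sqrt(b * b + 4.0 * snr)) / (2.0 * snr)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    delta, snr = 0.5, 4.0
    limit = asymptotic_mmse_gaussian(delta, snr)
    for n in (50, 200, 800):
        gap = finite_mmse_gaussian(n, delta, snr, rng) - limit
        print(f"n={n}: finite MMSE minus asymptotic formula = {gap:+.5f}")
```

A gap that stayed bounded away from zero as n grows would be the kind of counterexample described above. In this Gaussian case the gap is expected to shrink, consistent with the formula being exact.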
Original abstract
The ability to understand and solve high-dimensional inference problems is essential for modern data science. This article examines high-dimensional inference problems through the lens of information theory and focuses on the standard linear model as a canonical example that is both rich enough to be practically useful and simple enough to be studied rigorously. In particular, this model can exhibit phase transitions where an arbitrarily small change in the model parameters can induce large changes in the quality of estimates. For this model, the performance of optimal inference can be studied using the replica method from statistical physics but, until recently, it was not known if the resulting formulas were actually correct. In this chapter, we present a tutorial description of the standard linear model and its connection to information theory. We also describe the replica prediction for this model and outline the authors' recent proof that it is exact.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a tutorial on the standard linear model in high-dimensional inference. It connects the model to information-theoretic performance measures (mutual information and MMSE), describes the replica-symmetric predictions for phase transitions in estimation quality, and outlines the authors' recent proof that these predictions are exact.
Significance. If the exactness result holds, the work supplies a rigorous information-theoretic justification for replica formulas in a canonical model, enabling precise characterization of phase transitions without relying on heuristic statistical-physics arguments. The tutorial format also aids accessibility for researchers bridging statistical physics and information theory.
major comments (1)
- Abstract: the central claim that the replica prediction is exact rests on an outline of a recent external proof rather than a self-contained derivation. Without the full sequence of steps (e.g., any required limit interchanges, concentration results, or equivalence between mutual information and MMSE), the exactness assertion cannot be verified from the present text.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the tutorial character of the manuscript. We address the single major comment below.
- Referee: Abstract: the central claim that the replica prediction is exact rests on an outline of a recent external proof rather than a self-contained derivation. Without the full sequence of steps (e.g., any required limit interchanges, concentration results, or equivalence between mutual information and MMSE), the exactness assertion cannot be verified from the present text.
  Authors: The manuscript is explicitly framed as a tutorial whose purpose is to describe the model, the replica-symmetric predictions, and the high-level structure of the exactness proof; the abstract already states that we 'outline' rather than fully derive the result. The complete technical development, including all limit interchanges, concentration arguments, and the precise equivalence between mutual information and MMSE, is contained in the companion paper that introduced the proof. We are happy to add an explicit citation to that work directly in the abstract and to insert a short paragraph that lists the main technical ingredients (without reproducing the full proofs) so that readers know precisely where each step is justified.
  Revision: yes
Circularity Check
Replica exactness is asserted via an outline of the authors' own recent proof rather than a self-contained derivation.
specific steps
- Self-citation load bearing [Abstract]: "We also describe the replica prediction for this model and outline the authors' recent proof that it is exact."
  The assertion that the replica prediction is exact is supported solely by referencing an outline of the authors' own recent proof. The central claim of exactness therefore reduces to self-citation whose details are external to this manuscript, rather than being established by a derivation chain contained in the paper.
full rationale
The paper's central claim is that the replica prediction for the standard linear model is exact. The abstract explicitly states that the manuscript 'outline[s] the authors' recent proof that it is exact' after noting that 'until recently, it was not known if the resulting formulas were actually correct.' This makes the load-bearing exactness assertion dependent on self-citation to the authors' own recent work, whose full technical steps (e.g., any limit interchanges or equivalences between mutual information and MMSE) are not reproduced here. This matches the self_citation_load_bearing pattern because the justification for the key result reduces to an outline of a prior proof by overlapping authors rather than an independent derivation within the present text. No self-definitional, fitted-input, or ansatz-smuggling patterns are present in the given material, and the tutorial portions on the linear model appear self-contained.