
arxiv: 2604.10922 · v1 · submitted 2026-04-13 · 💻 cs.IT · math.IT · math.ST · stat.TH

α-Mutual Information for the Gaussian Noise Channel

Pith reviewed 2026-05-10 16:06 UTC · model grok-4.3

classification 💻 cs.IT · math.IT · math.ST · stat.TH
keywords Sibson's α-mutual information · Gaussian noise channel · I-MMSE relation · Rényi entropy · de Bruijn identity · tilted distributions · signal-to-noise ratio · information dimension

The pith

The derivative of α-mutual information with respect to SNR equals α/2 times the MMSE computed under α-tilted distributions for the Gaussian noise channel.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines Sibson's α-mutual information on the additive Gaussian noise channel and develops a set of structural results that parallel those known for ordinary mutual information. It proves finiteness and continuity conditions, establishes strict convexity and concavity properties, and derives an α-I-MMSE identity that expresses the SNR derivative of α-mutual information through the minimum mean-square error computed on suitably tilted input distributions. A reader would care because the identity recovers the classical I-MMSE relation when α equals one and simultaneously supplies new estimation-based expressions for Rényi entropy and a generalized de Bruijn identity. The work further gives explicit low-SNR expansions that depend only on input variance and high-SNR limits that relate α-mutual information to Rényi entropy or α-information dimension.

Core claim

The central claim is that an α-I-MMSE relationship holds for the Gaussian channel: the derivative of Sibson's α-mutual information with respect to signal-to-noise ratio equals α/2 times the minimum mean-square error evaluated under the corresponding α-tilted distributions. This identity implies a generalized de Bruijn identity and yields estimation-theoretic representations of Rényi entropy and differential Rényi entropy. In addition, the first-order low-SNR behavior of α-mutual information is determined solely by the input variance, and at high SNR α-mutual information converges to the Rényi entropy of order 1/α for discrete inputs or connects to α-information dimension for general inputs.

What carries the argument

The α-I-MMSE relationship, which equates the derivative of α-mutual information with respect to SNR to α/2 times the MMSE under α-tilted distributions.
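
For orientation, the identity can be written next to the classical one. The symbols Iα(X; snr) and mmseα(X; snr), the tilted pair (Xα, Yα), and the α/2 factor follow the Figure 1 caption excerpt; the channel normalization Y = √snr·X + Z with Z ~ N(0,1), the use of nats, and the exact construction of the tilted pair are assumptions here and should be checked against the paper.

```latex
% Classical I-MMSE relation (alpha = 1), information in nats
\frac{d}{d\,\mathsf{snr}}\, I(X;\mathsf{snr})
  = \frac{1}{2}\,\mathrm{mmse}(X;\mathsf{snr}),
\qquad
\mathrm{mmse}(X;\mathsf{snr}) = \mathbb{E}\big[(X - \mathbb{E}[X \mid Y])^2\big].

% alpha-I-MMSE relation as described in the Figure 1 caption excerpt
\frac{d}{d\,\mathsf{snr}}\, I_\alpha(X;\mathsf{snr})
  = \frac{\alpha}{2}\,\mathrm{mmse}_\alpha(X;\mathsf{snr}),
\qquad
\mathrm{mmse}_\alpha(X;\mathsf{snr})
  = \mathbb{E}\big[(X_\alpha - \mathbb{E}[X_\alpha \mid Y_\alpha])^2\big].
```

Setting α = 1 in the second line gives back the first, provided the tilted pair reduces to (X, Y) at α = 1.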

If this is right

  • The α-I-MMSE identity supplies a direct way to obtain the SNR derivative of α-mutual information from an estimation quantity.
  • To first order in SNR, low-SNR α-mutual information depends only on the variance of the input.
  • For discrete inputs the high-SNR limit of α-mutual information equals the Rényi entropy of order 1/α.
  • Rényi entropy and differential Rényi entropy admit new representations in terms of estimation errors under tilted distributions.
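
The low- and high-SNR bullets can be probed numerically for a simple discrete input. The sketch below evaluates the standard closed form of Sibson's α-mutual information, Iα = α/(α−1) · log ∫ (E[p(y|X)^α])^(1/α) dy, for an equiprobable binary input X ∈ {−1, +1} on Y = √snr·X + Z with Z ~ N(0,1). The function names, the binary input, the trapezoidal grid, and the channel normalization are illustrative assumptions, not taken from the paper. The low-SNR slope α·Var(X)/2 (in nats) used in the check comes from a first-order expansion worked out for this sketch under that normalization.

```python
import math


def gauss_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def sibson_mi_bpsk(alpha, snr, half_width=12.0, n=4001):
    """Sibson alpha-MI in nats for equiprobable X in {-1, +1} (hypothetical helper).

    Channel assumed here: Y = sqrt(snr) * X + Z with Z ~ N(0, 1).
    The outer integral over y is approximated with the trapezoidal rule.
    """
    if alpha <= 0 or abs(alpha - 1.0) < 1e-12:
        raise ValueError("this sketch covers alpha > 0 with alpha != 1")
    a = math.sqrt(snr)
    lim = a + half_width              # covers both signal points plus tails
    h = 2.0 * lim / (n - 1)
    total = 0.0
    for i in range(n):
        y = -lim + i * h
        # E_X[ p(y|X)^alpha ] for the two equiprobable signal points
        inner = 0.5 * (gauss_pdf(y - a) ** alpha + gauss_pdf(y + a) ** alpha)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * inner ** (1.0 / alpha)
    return alpha / (alpha - 1.0) * math.log(total * h)


if __name__ == "__main__":
    for alpha in (0.5, 2.0):
        slope = sibson_mi_bpsk(alpha, 1e-3) / 1e-3      # compare with alpha * Var(X) / 2
        high = sibson_mi_bpsk(alpha, 100.0)             # compare with log 2
        print(alpha, slope, alpha / 2.0, high, math.log(2.0))
```

For this input Var(X) = 1, so the slope should sit near α/2, and because the input is uniform on two points its Rényi entropy is log 2 for every order, which is the value the high-SNR column should approach.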

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The tilted-distribution construction may permit numerical evaluation of α-mutual information by reusing existing MMSE estimators.
  • Strict concavity properties could guarantee uniqueness in optimization problems that maximize or minimize α-mutual information.
  • The same regularity framework might be testable on other additive noise channels to check whether the α-I-MMSE relation survives beyond the Gaussian case.

Load-bearing premise

The assumption that α-mutual information remains finite and differentiable with respect to SNR for the input distributions and α values of interest.

What would settle it

A specific input distribution and α value, within the paper's finiteness conditions, for which the derivative of α-mutual information with respect to SNR fails to equal α/2 times the MMSE computed under the corresponding tilted distribution.
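
A check of this kind can be sketched for a case with a known answer. For a standard Gaussian input on Y = √snr·X + Z with Z ~ N(0,1), the standard closed form Iα = α/(α−1) · log ∫ (E[p(y|X)^α])^(1/α) dy evaluates to ½·log(1 + α·snr) nats, so the SNR derivative is α / (2(1 + α·snr)). Under the α/2 scaling quoted in the Figure 1 caption excerpt this would correspond to an α-MMSE of 1/(1 + α·snr); that reading is an inference made here, not a formula quoted from the paper. The helper names, grid sizes, and integration limits below are illustrative assumptions.

```python
import math


def phi(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def trapz(f, lo, hi, n):
    """Composite trapezoidal rule with n equally spaced nodes."""
    h = (hi - lo) / (n - 1)
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, n - 1):
        s += f(lo + i * h)
    return s * h


def sibson_mi_gaussian_input(alpha, snr, nx=161, ny=401):
    """Sibson alpha-MI in nats for X ~ N(0, 1), by nested quadrature (hypothetical helper)."""
    if alpha <= 0 or abs(alpha - 1.0) < 1e-12:
        raise ValueError("this sketch covers alpha > 0 with alpha != 1")
    a = math.sqrt(snr)

    def inner(y):
        # E_X[ p(y|X)^alpha ] with X ~ N(0, 1), truncated to |x| <= 8
        return trapz(lambda x: phi(x) * phi(y - a * x) ** alpha, -8.0, 8.0, nx)

    lim = 10.0 * math.sqrt(1.0 + alpha * snr)   # about ten standard deviations in y
    outer = trapz(lambda y: inner(y) ** (1.0 / alpha), -lim, lim, ny)
    return alpha / (alpha - 1.0) * math.log(outer)


def closed_form(alpha, snr):
    """Closed form for the Gaussian input: 0.5 * log(1 + alpha * snr)."""
    return 0.5 * math.log(1.0 + alpha * snr)


def snr_derivative(alpha, snr, d=1e-3):
    """Central finite difference of the numerically computed alpha-MI."""
    up = sibson_mi_gaussian_input(alpha, snr + d)
    down = sibson_mi_gaussian_input(alpha, snr - d)
    return (up - down) / (2.0 * d)


if __name__ == "__main__":
    alpha, snr = 2.0, 1.0
    print(sibson_mi_gaussian_input(alpha, snr), closed_form(alpha, snr))
    print(snr_derivative(alpha, snr), alpha / (2.0 * (1.0 + alpha * snr)))
```

The same harness could be pointed at a non-Gaussian input by swapping the inner expectation, but the tilted-distribution MMSE on the right-hand side would then have to be computed from the paper's own definition, which this sketch does not attempt.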

Figures

Figures reproduced from arXiv: 2604.10922 by Alex Dytso, Martina Cardone, Mohammad Milanian.

Figure 1. Iα(X; snr) and the corresponding α-MMSE mmseα(X; snr) versus snr for different values of α.
Adjacent text in the paper (B. α-I-MMSE Relationship): We here present an expression that connects α-mutual information and MMSE. In particular, this result shows that the rate of α-mutual information increase as SNR increases is equal to a fraction α/2 of the MMSE achieved by the optimal estimator of Xα given Yα. The main result, which generali…
Figure 2. Iα(X; snr) and the corresponding α-MMSE mmseα(X; snr) versus α for different values of snr.
Adjacent text in the paper (C. On Generalized de Bruijn’s Identity): The classical I-MMSE relationship is known to be equivalent to the de Bruijn’s identity [57], which relates the derivative of the Shannon differential entropy to the Fisher information. This connection is typically established via Brown’s identity. Equipped with our generalizat…
Original abstract

In this paper, we study Sibson's α-mutual information in the context of the additive Gaussian noise channel. While the classical case α = 1 is well understood and admits deep connections to estimation-theoretic quantities, such as the minimum mean-square error (MMSE) and Fisher information, many of the corresponding structural properties for general α remain less explored. Our goal is to develop a systematic understanding of α-mutual information in the Gaussian noise setting and to identify which properties extend beyond the Shannon case. To this end, we establish several regularity properties, including finiteness conditions, continuity with respect to the signal-to-noise ratio (SNR) and the input distribution, and strict concavity/convexity properties that ensure uniqueness in associated optimization problems. A central contribution is the development of an α-I-MMSE relationship, generalizing the classical identity by relating the derivative of α-mutual information with respect to SNR to the MMSE evaluated under appropriately tilted distributions. This connection further leads to a generalized de Bruijn identity and new estimation-theoretic representations of Rényi entropy and differential Rényi entropy. We also characterize the low- and high-SNR behavior. In the low-SNR regime, the first-order behavior depends only on the input variance. In the high-SNR regime, for discrete inputs, α-mutual information converges to the Rényi entropy of order 1/α, while for general inputs we connect it to α-information dimension. Overall, our results show that many fundamental relationships between information and estimation extend beyond the Shannon setting, in a form involving α-tilted distributions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper studies Sibson's α-mutual information for the additive Gaussian noise channel. It establishes finiteness conditions, continuity in SNR and input distribution, and strict concavity/convexity properties. A central result is an α-I-MMSE identity relating the derivative of α-MI w.r.t. SNR to the MMSE under α-tilted output distributions; this yields a generalized de Bruijn identity and new representations of Rényi entropy. The work also characterizes low-SNR behavior (depending only on input variance) and high-SNR asymptotics (convergence to Rényi entropy of order 1/α for discrete inputs, or α-information dimension for general inputs).

Significance. If the regularity conditions and derivative identities hold rigorously, the α-I-MMSE relation and its consequences provide a systematic extension of classical information-estimation links (I-MMSE, de Bruijn) to the Rényi/α setting for Gaussian channels. The low- and high-SNR characterizations and convexity results are useful for optimization problems involving α-MI. The manuscript supplies concrete asymptotic expressions and tilted-distribution representations, which are falsifiable and potentially reusable.

major comments (2)
  1. [§4 (α-I-MMSE theorem and proof)] The central α-I-MMSE identity (abstract and §4) requires differentiability of α-MI w.r.t. SNR and interchange of derivative with the integral representation of α-MI. The paper invokes finiteness, continuity, and regularity properties to justify this, but does not explicitly verify that the α-tilted measures remain valid probability distributions (i.e., the normalizing constant is finite and the density exists) for all inputs where α-MI is declared finite, particularly for unbounded-support continuous inputs at finite SNR. This is load-bearing for the derivative claim.
  2. [§3 (regularity properties) and §4] The finiteness conditions for α-MI (stated in §3) are given, but the manuscript does not supply an explicit check or counter-example showing that these conditions automatically guarantee the existence of the tilted density and the validity of the derivative identity when the input has slow-decaying tails. Without this, the scope of the α-I-MMSE relation remains unclear.
minor comments (3)
  1. [§4] The definition of the α-tilted output distribution (Eq. (X) in §4) should be written explicitly with the normalizing constant shown, to make the subsequent MMSE expression immediately verifiable.
  2. [§5 (high-SNR asymptotics)] Notation for the α-MI functional and the tilting parameter should be introduced once and used consistently; occasional reuse of I_α without the channel subscript creates minor ambiguity in the high-SNR section.
  3. [§6] The low-SNR expansion (first-order term depending only on variance) is stated cleanly, but a short remark on whether higher-order terms involve higher moments would help readers compare with the classical I-MMSE expansion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments correctly identify points where the exposition of regularity conditions and the justification for the derivative identity can be strengthened. We address each major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [§4 (α-I-MMSE theorem and proof)] The central α-I-MMSE identity (abstract and §4) requires differentiability of α-MI w.r.t. SNR and interchange of derivative with the integral representation of α-MI. The paper invokes finiteness, continuity, and regularity properties to justify this, but does not explicitly verify that the α-tilted measures remain valid probability distributions (i.e., the normalizing constant is finite and the density exists) for all inputs where α-MI is declared finite, particularly for unbounded-support continuous inputs at finite SNR. This is load-bearing for the derivative claim.

    Authors: We agree that an explicit verification would strengthen the argument. The finiteness of α-MI is defined through the finiteness of the integral appearing in its expression, which is precisely the normalizing constant of the α-tilted output measure; hence the tilted object is a probability distribution whenever α-MI is finite. Because the channel is Gaussian, absolute continuity with respect to Lebesgue measure is preserved under the tilting operation for any input distribution that induces a well-defined output density. For inputs with unbounded support the moment conditions implicit in the finiteness statement of §3 already control the tails sufficiently for the dominated-convergence argument used to interchange differentiation and integration. In the revision we will insert a short lemma (or dedicated remark) immediately preceding the α-I-MMSE theorem that states and proves these facts, thereby making the scope of the identity fully explicit. revision: yes

  2. Referee: [§3 (regularity properties) and §4] The finiteness conditions for α-MI (stated in §3) are given, but the manuscript does not supply an explicit check or counter-example showing that these conditions automatically guarantee the existence of the tilted density and the validity of the derivative identity when the input has slow-decaying tails. Without this, the scope of the α-I-MMSE relation remains unclear.

    Authors: The finiteness conditions listed in §3 are formulated exactly so that the relevant integrals remain finite after the Gaussian convolution and the subsequent α-tilting; the Gaussian kernel supplies enough smoothing that slow polynomial decay of the input tails does not prevent the tilted density from existing. We therefore do not expect counter-examples inside the regime where α-MI is declared finite. To remove any ambiguity we will add, in the revised §3, a brief paragraph together with a concrete illustration (e.g., a Student-t input with degrees of freedom chosen so that α-MI remains finite) confirming that the tilted density exists and that the derivative identity continues to hold. This addition will clarify the applicability of the α-I-MMSE relation without altering the stated theorems. revision: yes
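
The normalizing-constant argument in the first response can be made concrete with the standard closed form of Sibson's α-mutual information. The expressions below are the textbook forms for a channel with conditional density p_{Y|X}; the symbol q*_α for the tilted output density is a label chosen here, and the paper's own definition of the α-tilted distributions may differ in notation or normalization.

```latex
% Sibson's alpha-mutual information (alpha > 0, alpha \neq 1), in nats
I_\alpha(X;Y)
  = \min_{Q_Y} D_\alpha\!\left(P_{XY} \,\middle\|\, P_X \times Q_Y\right)
  = \frac{\alpha}{\alpha-1}\,
    \log \int \Big(\mathbb{E}\big[p_{Y|X}^{\alpha}(y \mid X)\big]\Big)^{1/\alpha}\,dy .

% Minimizing (tilted) output density; its denominator is the integral above
q_\alpha^{*}(y)
  = \frac{\Big(\mathbb{E}\big[p_{Y|X}^{\alpha}(y \mid X)\big]\Big)^{1/\alpha}}
         {\displaystyle\int \Big(\mathbb{E}\big[p_{Y|X}^{\alpha}(y' \mid X)\big]\Big)^{1/\alpha}\,dy'} .

% Gaussian noise channel assumed in this sketch
p_{Y|X}(y \mid x) = \frac{1}{\sqrt{2\pi}}\,
  \exp\!\Big(-\tfrac{1}{2}\big(y - \sqrt{\mathsf{snr}}\,x\big)^2\Big).
```

In this form the integral inside the logarithm is exactly the denominator of q*_α, so for α ≠ 1 the α-mutual information is finite precisely when that integral is finite, which is the point the authors make. Whether finiteness alone also licenses the interchange of derivative and integral is the separate question the referee raises.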

Circularity Check

0 steps flagged

No significant circularity in the α-I-MMSE derivation

Full rationale

The paper establishes finiteness, continuity, and convexity/concavity properties of α-mutual information independently before deriving the α-I-MMSE identity as a consequence of differentiating the α-MI functional with respect to SNR and relating it to MMSE under α-tilted distributions constructed directly from the input and Gaussian channel law. No quoted step reduces the central claim to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation is presented as extending the classical I-MMSE relation via explicit tilted measures rather than by tautology or renaming. The provided abstract and context show a self-contained chain against the definitions of α-MI and the channel.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard measure-theoretic assumptions for mutual information to be finite, plus new definitions of α-tilted distributions whose validity is asserted under finiteness conditions stated in the abstract.

axioms (2)
  • domain assumption: α-mutual information is finite and differentiable with respect to SNR under the stated regularity conditions.
    Invoked to justify the derivative identities and continuity claims.
  • standard math: Standard properties of Rényi entropy and differential entropy carry over to the α-tilted measures.
    Used for the generalized de Bruijn identity and high-SNR limits.

pith-pipeline@v0.9.0 · 5617 in / 1618 out tokens · 63069 ms · 2026-05-10T16:06:04.329986+00:00 · methodology


Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages

  1. [1]

    On Measures of Entropy and Information,

    A. Rényi, “On Measures of Entropy and Information,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press, 1961, pp. 547–562

  2. [2]

    Cumulant Generating Function of Codeword Lengths in Optimal Lossless Compression,

    T. A. Courtade and S. Verdú, “Cumulant Generating Function of Codeword Lengths in Optimal Lossless Compression,” in 2014 IEEE International Symposium on Information Theory, 2014, pp. 2494–2498

  3. [3]

    Rényi’s Entropy and the Probability of Error,

    M. Ben-Bassat and J. Raviv, “Rényi’s Entropy and the Probability of Error,” IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 324–331, 1978

  4. [4]

    Arimoto-Rényi Conditional Entropy and Bayesian m-ary Hypothesis Testing,

    I. Sason and S. Verdú, “Arimoto-Rényi Conditional Entropy and Bayesian m-ary Hypothesis Testing,” IEEE Transactions on Information Theory, vol. 64, no. 1, pp. 4–25, 2017

  5. [5]

    An Inequality on Guessing and its Application to Sequential Decoding,

    E. Arikan, “An Inequality on Guessing and its Application to Sequential Decoding,” IEEE Transactions on Information Theory, vol. 42, no. 1, pp. 99–105, 1996

  6. [6]

    A Primer on Alpha-Information Theory with Application to Leakage in Secrecy Systems,

    O. Rioul, “A Primer on Alpha-Information Theory with Application to Leakage in Secrecy Systems,” in International Conference on Geometric Science of Information. Springer, 2021, pp. 459–467

  7. [7]

    Information Measures and Capacity of Order α for Discrete Memoryless Channels,

    S. Arimoto, “Information Measures and Capacity of Order α for Discrete Memoryless Channels,” Topics in Information Theory, 1977

  8. [8]

    Information Radius,

    R. Sibson, “Information Radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969

  9. [9]

    Generalized Cutoff Rates and Rényi’s Information Measures,

    I. Csiszár, “Generalized Cutoff Rates and Rényi’s Information Measures,” IEEE Transactions on Information Theory, vol. 41, no. 1, pp. 26–34, 1995

  10. [10]

    Noisy Channels,

    U. Augustin, “Noisy Channels,” Ph.D. dissertation, Universität Erlangen–Nürnberg, 1978, Habilitation Thesis

  11. [11]

    Two Measures of Dependence,

    A. Lapidoth and C. Pfister, “Two Measures of Dependence,” Entropy, vol. 21, no. 8, 2019

  12. [12]

    Sibson α-Mutual Information and Its Variational Representations,

    A. R. Esposito, M. Gastpar, and I. Issa, “Sibson α-Mutual Information and Its Variational Representations,” IEEE Transactions on Information Theory, pp. 1–1, 2025

  13. [13]

    Arimoto Channel Coding Converse and Rényi Divergence,

    Y. Polyanskiy and S. Verdú, “Arimoto Channel Coding Converse and Rényi Divergence,” in 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010, pp. 1327–1333

  14. [14]

    Generalization Error Bounds via Rényi-, f-Divergences and Maximal Leakage,

    A. R. Esposito, M. Gastpar, and I. Issa, “Generalization Error Bounds via Rényi-, f-Divergences and Maximal Leakage,” IEEE Transactions on Information Theory, vol. 67, no. 8, pp. 4986–5004, 2021

  15. [15]

    On Meta-Bound for Lower Bounds of Bayes Risk,

    S. Saito, “On Meta-Bound for Lower Bounds of Bayes Risk,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 3162–3167

  16. [16]

    Mutual Information and Minimum Mean-Square Error in Gaussian Channels,

    D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual Information and Minimum Mean-Square Error in Gaussian Channels,” IEEE Transactions on Information Theory, vol. 51, no. 4, pp. 1261–1282, 2005

  17. [17]

    Derivative of Mutual Information at Zero SNR: The Gaussian-Noise Case,

    Y. Wu, D. Guo, and S. Verdú, “Derivative of Mutual Information at Zero SNR: The Gaussian-Noise Case,” IEEE Transactions on Information Theory, vol. 57, no. 11, pp. 7307–7312, 2011

  18. [18]

    α-Mutual Information,

    S. Verdú, “α-Mutual Information,” in 2015 Information Theory and Applications Workshop (ITA), 2015, pp. 1–6

  19. [19]

    The Zero Error Capacity of a Noisy Channel,

    C. Shannon, “The Zero Error Capacity of a Noisy Channel,” IRE Transactions on Information Theory, vol. 2, no. 3, pp. 8–19, 1956

  20. [20]

    A Simple Derivation of the Coding Theorem and Some Applications,

    R. Gallager, “A Simple Derivation of the Coding Theorem and Some Applications,” IEEE Transactions on Information Theory, vol. 11, no. 1, pp. 3–18, 1965

  21. [21]

    Variable-Length Lossy Compression and Channel Coding: Non-Asymptotic Converses via Cumulant Generating Functions,

    T. A. Courtade and S. Verdú, “Variable-Length Lossy Compression and Channel Coding: Non-Asymptotic Converses via Cumulant Generating Functions,” in 2014 IEEE International Symposium on Information Theory, 2014, pp. 2499–2503

  22. [22]

    On the Converse to the Coding Theorem for Discrete Memoryless Channels (corresp.),

    S. Arimoto, “On the Converse to the Coding Theorem for Discrete Memoryless Channels (corresp.),” IEEE Transactions on Information Theory, vol. 19, no. 3, pp. 357–359, 1973

  23. [23]

    Error Exponents and α-Mutual Information,

    S. Verdú, “Error Exponents and α-Mutual Information,” Entropy, vol. 23, no. 2, p. 199, 2021

  24. [24]

    Exact Exponent for Soft Covering,

    S. Yagli and P. Cuff, “Exact Exponent for Soft Covering,” IEEE Transactions on Information Theory, vol. 65, no. 10, pp. 6234–6262, 2019

  25. [25]

    An Operational Approach to Information Leakage,

    I. Issa, A. B. Wagner, and S. Kamath, “An Operational Approach to Information Leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2019

  26. [26]

    Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs,

    J. Liao, O. Kosut, L. Sankar, and F. du Pin Calmon, “Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019

  27. [27]

    Information-Theoretic Lower Bounds on Bayes Risk in Decentralized Estimation,

    A. Xu and M. Raginsky, “Information-Theoretic Lower Bounds on Bayes Risk in Decentralized Estimation,” IEEE Transactions on Information Theory, vol. 63, no. 3, pp. 1580–1600, 2016

  28. [28]

    Lower-Bounds on the Bayesian Risk in Estimation Procedures via Sibson’s α-Mutual Information,

    A. R. Esposito and M. Gastpar, “Lower-Bounds on the Bayesian Risk in Estimation Procedures via Sibson’s α-Mutual Information,” in 2021 IEEE International Symposium on Information Theory (ISIT), 2021, pp. 748–753

  29. [29]

    Alpha-NML Universal Predictors,

    M. Bondaschi and M. Gastpar, “Alpha-NML Universal Predictors,” IEEE Transactions on Information Theory, vol. 71, no. 2, pp. 1171–1183, 2025

  30. [30]

    Convexity/Concavity of Rényi Entropy and α-Mutual Information,

    S.-W. Ho and S. Verdú, “Convexity/Concavity of Rényi Entropy and α-Mutual Information,” in 2015 IEEE International Symposium on Information Theory (ISIT), 2015, pp. 745–749

  31. [31]

    Alternating Optimization Approach for Computing α-Mutual Information and α-Capacity,

    A. Kamatsuka, K. Kazama, and T. Yoshida, “Alternating Optimization Approach for Computing α-Mutual Information and α-Capacity,” in 2025 IEEE International Symposium on Information Theory (ISIT), 2025, pp. 1–6

  32. [32]

    Conditional Rényi Divergence Saddlepoint and the Maximization of α-Mutual Information,

    C. Cai and S. Verdú, “Conditional Rényi Divergence Saddlepoint and the Maximization of α-Mutual Information,” Entropy, vol. 21, no. 10, p. 969, 2019

  33. [33]

    Functional Properties of Minimum Mean-Square Error and Mutual Information,

    Y. Wu and S. Verdú, “Functional Properties of Minimum Mean-Square Error and Mutual Information,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1289–1301, 2012

  34. [34]

    Estimation in Gaussian Noise: Properties of the Minimum Mean-Square Error,

    D. Guo, Y. Wu, S. Shamai (Shitz), and S. Verdú, “Estimation in Gaussian Noise: Properties of the Minimum Mean-Square Error,” IEEE Transactions on Information Theory, vol. 57, no. 4, pp. 2371–2385, 2011

  35. [35]

    The Interplay between Information and Estimation Measures,

    D. Guo, S. Shamai (Shitz), and S. Verdú, “The Interplay between Information and Estimation Measures,” Foundations and Trends in Signal Processing, vol. 6, no. 4, pp. 243–429, 2012

  36. [36]

    A View of Information-Estimation Relations in Gaussian Networks,

    A. Dytso, R. Bustin, H. V. Poor, and S. Shamai (Shitz), “A View of Information-Estimation Relations in Gaussian Networks,” Entropy, vol. 19, no. 8, p. 409, 2017

  37. [37]

    On Classical Analogues of Free Entropy Dimension,

    A. Guionnet and D. Shlyakhtenko, “On Classical Analogues of Free Entropy Dimension,” Journal of Functional Analysis, vol. 251, no. 2, pp. 738–771, 2007

  38. [38]

    MMSE Dimension,

    Y. Wu and S. Verdú, “MMSE Dimension,” IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 4857–4879, 2011

  39. [39]

    Entropic Isoperimetric and Cramér–Rao Inequalities for Rényi–Fisher Information,

    H. Wu and L. Yu, “Entropic Isoperimetric and Cramér–Rao Inequalities for Rényi–Fisher Information,” IEEE Transactions on Information Theory, 2025

  40. [40]

    On Rényi Entropy Power Inequalities,

    E. Ram and I. Sason, “On Rényi Entropy Power Inequalities,” IEEE Transactions on Information Theory, vol. 62, no. 12, pp. 6800–6815, 2016

  41. [41]

    Rényi Divergence and Kullback-Leibler Divergence,

    T. Van Erven and P. Harremoës, “Rényi Divergence and Kullback-Leibler Divergence,” IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 3797–3820, 2014

  42. [42]

    f-divergence inequalities,

    I. Sason and S. Verdú, “f-divergence inequalities,” IEEE Transactions on Information Theory, vol. 62, no. 11, pp. 5973–6006, 2016

  43. [43]

    Probability Theory

    A. Rényi, Probability Theory. Mineola, N.Y.: Dover Publications, 2007, unabridged republication of the work published by North-Holland Publishing Company, Amsterdam, 1970

  44. [44]

    On the Dimension and Entropy of Order α of the Mixture of Probability Distributions,

    I. Csiszár, “On the Dimension and Entropy of Order α of the Mixture of Probability Distributions,” Acta Mathematica Hungarica, vol. 13, no. 3-4, pp. 245–255, 1962

  45. [45]

    Rényi Information Dimension: Fundamental Limits of Almost Lossless Analog Compression,

    Y. Wu and S. Verdú, “Rényi Information Dimension: Fundamental Limits of Almost Lossless Analog Compression,” IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3721–3748, 2010

  46. [46]

    Information Dimension and the Degrees of Freedom of the Interference Channel,

    Y. Wu, S. Shamai (Shitz), and S. Verdú, “Information Dimension and the Degrees of Freedom of the Interference Channel,” IEEE Transactions on Information Theory, vol. 61, no. 1, pp. 256–279, 2015

  47. [47]

    The Information Capacity of Amplitude- and Variance-Constrained Scalar Gaussian Channels,

    J. G. Smith, “The Information Capacity of Amplitude- and Variance-Constrained Scalar Gaussian Channels,” Information and Control, vol. 18, no. 3, pp. 203–219, 1971

  48. [48]

    When Are Discrete Channel Inputs Optimal?—Optimization Techniques and Some New Results,

    A. Dytso, M. Goldenbaum, H. V. Poor, and S. Shamai (Shitz), “When Are Discrete Channel Inputs Optimal?—Optimization Techniques and Some New Results,” in 2018 52nd Annual Conference on Information Sciences and Systems (CISS), 2018, pp. 1–6

  49. [49]

    Capacity-Achieving Input Distributions of Additive Vector Gaussian Noise Channels: Even-Moment Constraints and Unbounded or Compact Support,

    J. Eisen, R. R. Mazumdar, and P. Mitran, “Capacity-Achieving Input Distributions of Additive Vector Gaussian Noise Channels: Even-Moment Constraints and Unbounded or Compact Support,” Entropy, vol. 25, no. 8, p. 1180, 2023

  50. [50]

    M. S. Pinsker, Information and Information Stability of Random Variables and Processes, ser. Holden-Day series in time series analysis. San Francisco: Holden-Day, Inc., 1964

  51. [51]

    R. B. Ash, Information Theory. Dover Publications, 1990, originally published in 1965

  52. [52]

    An Empirical Bayes Approach to Statistics,

    H. Robbins, “An Empirical Bayes Approach to Statistics,” in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability. Citeseer, 1956

  53. [53]

    Some Geometric Properties of the Likelihood Ratio (corresp.),

    C. Hatsell and L. Nolte, “Some Geometric Properties of the Likelihood Ratio (corresp.),” IEEE Transactions on Information Theory, vol. 17, no. 5, pp. 616–618, 1971

  54. [54]

    Conditional Mean Estimation in Gaussian Noise: A Meta Derivative Identity with Applications,

    A. Dytso, H. V. Poor, and S. Shamai (Shitz), “Conditional Mean Estimation in Gaussian Noise: A Meta Derivative Identity with Applications,” IEEE Transactions on Information Theory, vol. 69, no. 3, pp. 1883–1898, 2022

  55. [55]

    Admissible Estimators, Recurrent Diffusions, and Insoluble Boundary Value Problems,

    L. D. Brown, “Admissible Estimators, Recurrent Diffusions, and Insoluble Boundary Value Problems,” The Annals of Mathematical Statistics, vol. 42, no. 3, pp. 855–903, 1971

  56. [56]

    Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory

    L. D. Brown, Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory, ser. Institute of Mathematical Statistics Lecture Notes—Monograph Series. Hayward, CA: Institute of Mathematical Statistics, 1986, vol. 9

  57. [57]

    Some Inequalities Satisfied by the Quantities of Information of Fisher and Shannon,

    A. J. Stam, “Some Inequalities Satisfied by the Quantities of Information of Fisher and Shannon,” Information and Control, vol. 2, no. 2, pp. 101–112, 1959

  58. [58]

    On Channel Capacity per Unit Cost,

    S. Verdú, “On Channel Capacity per Unit Cost,” IEEE Transactions on Information Theory, vol. 36, no. 5, pp. 1019–1030, 1990

  59. [59]

    Fading Channels: How Perfect Need “Perfect Side Information” Be?

    A. Lapidoth and S. Shamai (Shitz), “Fading Channels: How Perfect Need “Perfect Side Information” Be?” IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1118–1134, 2002

  60. [60]

    Spectral Efficiency in the Wideband Regime,

    S. Verdú, “Spectral Efficiency in the Wideband Regime,” IEEE Transactions on Information Theory, vol. 48, no. 6, pp. 1319–1343, 2002

  61. [61]

    A Simple Proof of the Entropy-Power Inequality,

    D. Guo, “A Simple Proof of the Entropy-Power Inequality,” IEEE Transactions on Information Theory, vol. 52, no. 5, pp. 2165–2166, 2006

  62. [62]

    Monotonic Decrease of the non-Gaussianness of the Sum of Independent Random Variables: A Simple Proof,

    A. M. Tulino and S. Verdú, “Monotonic Decrease of the non-Gaussianness of the Sum of Independent Random Variables: A Simple Proof,” IEEE Transactions on Information Theory, vol. 52, no. 9, pp. 4295–4297, 2006

  63. [63]

    Proof of Entropy Power Inequalities via MMSE,

    D. Guo, S. Shamai (Shitz), and S. Verdú, “Proof of Entropy Power Inequalities via MMSE,” in 2006 IEEE International Symposium on Information Theory, 2006, pp. 1011–1015

  64. [64]

    Rényi Entropy Dimension of the Mixture of Measures,

    M. Śmieja and J. Tabor, “Rényi Entropy Dimension of the Mixture of Measures,” in 2014 Science and Information Conference, 2014, pp. 685–689

  65. [65]

    Concentration of Measure Inequalities in Information Theory, Communications, and Coding,

    M. Raginsky and I. Sason, “Concentration of Measure Inequalities in Information Theory, Communications, and Coding,” Foundations and Trends in Communications and Information Theory, vol. 10, no. 1-2, pp. 1–247, 2013

  66. [66]

    Mutual Information as a Function of Matrix SNR for Linear Gaussian Channels,

    G. Reeves, H. D. Pfister, and A. Dytso, “Mutual Information as a Function of Matrix SNR for Linear Gaussian Channels,” in 2018 IEEE International Symposium on Information Theory (ISIT), 2018, pp. 1754–1758

  67. [67]

    An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel,

    R. Bustin, R. Liu, H. V. Poor, and S. Shamai (Shitz), “An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel,” EURASIP Journal on Wireless Communications and Networking, vol. 2009, no. 1, p. 370970, 2009

  68. [68]

    Optimum Power Allocation for Parallel Gaussian Channels with Arbitrary Input Distributions,

    A. Lozano, A. M. Tulino, and S. Verdú, “Optimum Power Allocation for Parallel Gaussian Channels with Arbitrary Input Distributions,” IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 3033–3051, 2006

  69. [69]

    The Rate-Distortion Dimension of Sets and Measures,

    T. Kawabata and A. Dembo, “The Rate-Distortion Dimension of Sets and Measures,” IEEE Transactions on Information Theory, vol. 40, no. 5, pp. 1564–1572, 1994

  70. [70]

    The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points,

    A. Dytso, S. Yagli, H. V. Poor, and S. Shamai (Shitz), “The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points,” IEEE Transactions on Information Theory, vol. 66, no. 4, pp. 2006–2022, 2019

  71. [71]

    Degrees of freedom in vector interference channels,

    D. Stotz and H. Bölcskei, “Degrees of freedom in vector interference channels,” IEEE Transactions on Information Theory, vol. 62, no. 7, pp. 4172–4197, 2016

  72. [72]

    Mutual Information and Conditional Mean Estimation in Poisson Channels,

    D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual Information and Conditional Mean Estimation in Poisson Channels,” IEEE Transactions on Information Theory, vol. 54, no. 5, pp. 1837–1849, 2008

  73. [73]

    Mutual Information, Relative Entropy, and Estimation in the Poisson Channel,

    R. Atar and T. Weissman, “Mutual Information, Relative Entropy, and Estimation in the Poisson Channel,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1302–1318, 2012

  74. [74]

    Relations Between Information and Estimation in Discrete-Time Lévy Channels,

    J. Jiao, K. Venkat, and T. Weissman, “Relations Between Information and Estimation in Discrete-Time Lévy Channels,” IEEE Transactions on Information Theory, vol. 63, no. 6, pp. 3579–3594, 2017

  75. [75]

    R. M. Dudley, Real Analysis and Probability. Chapman and Hall/CRC, 2018

  76. [76]

    G. B. Folland, Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, 1999