$\alpha$-Mutual Information for the Gaussian Noise Channel

Alex Dytso; Martina Cardone; Mohammad Milanian

arXiv: 2604.10922 · v1 · submitted 2026-04-13 · 💻 cs.IT · math.IT · math.ST · stat.TH

Mohammad Milanian, Alex Dytso, Martina Cardone

Pith reviewed 2026-05-10 16:06 UTC · model grok-4.3

keywords: Sibson's α-mutual information · Gaussian noise channel · I-MMSE relation · Rényi entropy · de Bruijn identity · tilted distributions · signal-to-noise ratio · information dimension

The pith

For the Gaussian noise channel, the derivative of α-mutual information with respect to SNR equals α/2 times the MMSE computed under α-tilted distributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines Sibson's α-mutual information on the additive Gaussian noise channel and develops a set of structural results that parallel those known for ordinary mutual information. It proves finiteness and continuity conditions, establishes strict convexity and concavity properties, and derives an α-I-MMSE identity that expresses the SNR derivative of α-mutual information through the minimum mean-square error computed on suitably tilted input distributions. A reader would care because the identity recovers the classical I-MMSE relation when α equals one and simultaneously supplies new estimation-based expressions for Rényi entropy and a generalized de Bruijn identity. The work further gives explicit low-SNR expansions that depend only on input variance and high-SNR limits that relate α-mutual information to Rényi entropy or α-information dimension.

Core claim

The central claim is that an α-I-MMSE relationship holds for the Gaussian channel: the derivative of Sibson's α-mutual information with respect to signal-to-noise ratio equals α/2 times the minimum mean-square error evaluated under the corresponding α-tilted distributions, which reduces to the classical factor 1/2 at α = 1. This identity implies a generalized de Bruijn identity and yields estimation-theoretic representations of Rényi entropy and differential Rényi entropy. In addition, the first-order low-SNR behavior of α-mutual information is determined solely by the input variance. At high SNR, α-mutual information converges to the Rényi entropy of order 1/α for discrete inputs and is connected to the α-information dimension for general inputs.
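For orientation, the objects in this claim can be written out explicitly. The block below is a sketch in notation chosen here, not the paper's exact statement: the channel normalization, the symbols $\phi$, $X_\alpha$, $Y_\alpha$, and the specific form of the tilted pair are assumptions, and only the factor α/2 is taken from the paper's text around Figure 1.

```latex
% Channel (assumed normalization): N ~ N(0,1) independent of X,
% \phi denotes the standard normal density.
Y = \sqrt{\mathsf{snr}}\, X + N .

% Sibson's \alpha-mutual information in nats, \alpha \in (0,1) \cup (1,\infty).
I_\alpha(X;\mathsf{snr}) = \frac{\alpha}{\alpha-1}\,
  \log \int_{\mathbb{R}}
  \Big( \mathbb{E}\big[\phi^{\alpha}(y-\sqrt{\mathsf{snr}}\,X)\big] \Big)^{1/\alpha}
  \,\mathrm{d}y .

% One natural reading of the tilted pair (assumption, not quoted from the paper).
f_{Y_\alpha}(y) \propto
  \Big( \mathbb{E}\big[\phi^{\alpha}(y-\sqrt{\mathsf{snr}}\,X)\big] \Big)^{1/\alpha},
\qquad
P_{X_\alpha \mid Y_\alpha = y}(\mathrm{d}x) \propto
  \phi^{\alpha}(y-\sqrt{\mathsf{snr}}\,x)\, P_X(\mathrm{d}x) .

% \alpha-I-MMSE identity with the factor \alpha/2.
\frac{\mathrm{d}}{\mathrm{d}\,\mathsf{snr}}\, I_\alpha(X;\mathsf{snr})
  = \frac{\alpha}{2}\, \mathrm{mmse}_\alpha(X;\mathsf{snr}),
\qquad
\mathrm{mmse}_\alpha(X;\mathsf{snr})
  = \mathbb{E}\big[\operatorname{Var}(X_\alpha \mid Y_\alpha)\big] .
```

Letting α tend to 1 removes the tilt, and the last line becomes the classical relation of Guo, Shamai and Verdú, in which the derivative of mutual information equals one half of the MMSE.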

What carries the argument

The α-I-MMSE relationship carries the argument: it equates the derivative of α-mutual information with respect to SNR with α/2 times the MMSE under α-tilted distributions.
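A Gaussian input makes the mechanism concrete, because every integral is available in closed form. The computation below is a sketch derived here by completing the square, assuming $X \sim \mathcal{N}(0,1)$ and the normalization $Y = \sqrt{\mathsf{snr}}\,X + N$ with unit-variance noise; it is not quoted from the paper.

```latex
% X ~ N(0,1), \phi = standard normal density, \alpha > 0, \alpha \neq 1.
g(y) := \int \phi(x)\,\phi^{\alpha}(y-\sqrt{\mathsf{snr}}\,x)\,\mathrm{d}x
  = (2\pi)^{-\alpha/2}\,(1+\alpha\,\mathsf{snr})^{-1/2}
    \exp\!\Big(-\frac{\alpha\,y^{2}}{2(1+\alpha\,\mathsf{snr})}\Big),

\int g(y)^{1/\alpha}\,\mathrm{d}y
  = (1+\alpha\,\mathsf{snr})^{\frac{\alpha-1}{2\alpha}},
\qquad
I_\alpha(X;\mathsf{snr}) = \tfrac{1}{2}\,\log(1+\alpha\,\mathsf{snr}),

\frac{\mathrm{d}}{\mathrm{d}\,\mathsf{snr}}\, I_\alpha(X;\mathsf{snr})
  = \frac{\alpha}{2}\cdot\frac{1}{1+\alpha\,\mathsf{snr}} .
```

The factor $1/(1+\alpha\,\mathsf{snr})$ is the variance of the density proportional to $\phi(x)\,\phi^{\alpha}(y-\sqrt{\mathsf{snr}}\,x)$, so the derivative is α/2 times an MMSE computed under a tilted law. At α = 1 it reduces to the familiar Gaussian value $\mathrm{mmse} = 1/(1+\mathsf{snr})$.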

If this is right

The α-I-MMSE identity supplies a direct way to obtain the SNR derivative of α-mutual information from an estimation quantity.
To first order in SNR, low-SNR α-mutual information depends only on the variance of the input.
For discrete inputs the high-SNR limit of α-mutual information equals the Rényi entropy of order 1/α.
Rényi entropy and differential Rényi entropy admit new representations in terms of estimation errors under tilted distributions.
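The low- and high-SNR consequences listed above can be probed numerically for a discrete input. The sketch below is an illustration written for this page, not code from the paper: the function names `alpha_mi` and `renyi_entropy`, the channel normalization Y = sqrt(snr) X + N with unit-variance noise, the binary test input, and the first-order slope α·Var(X)/2 (obtained here from a short Taylor expansion) are all assumptions.

```python
import numpy as np

def alpha_mi(xs, ps, snr, alpha, n=20001, pad=12.0):
    """Sibson alpha-MI in nats for a discrete input X over Y = sqrt(snr) X + N,
    N ~ N(0, 1), via a Riemann sum over y. Requires alpha > 0 and alpha != 1."""
    xs, ps = np.asarray(xs, float), np.asarray(ps, float)
    root = np.sqrt(snr)
    y = np.linspace(root * xs.min() - pad, root * xs.max() + pad, n)
    dy = y[1] - y[0]
    d = y[None, :] - root * xs[:, None]
    # log[ p(x) * phi(y - sqrt(snr) x)^alpha ] for every (x, y) pair
    log_w = np.log(ps)[:, None] + alpha * (-0.5 * d**2 - 0.5 * np.log(2 * np.pi))
    top = log_w.max(axis=0)
    log_g = top + np.log(np.exp(log_w - top).sum(axis=0))  # log-sum-exp over x
    z = np.exp(log_g / alpha).sum() * dy                    # integral of g^(1/alpha)
    return alpha / (alpha - 1.0) * np.log(z)

def renyi_entropy(ps, order):
    """Renyi entropy in nats of a probability vector, for order != 1."""
    ps = np.asarray(ps, float)
    return np.log((ps**order).sum()) / (1.0 - order)

# Hypothetical test input: a biased binary source on {-1, +1}.
xs, ps, alpha = [-1.0, 1.0], [0.2, 0.8], 2.0

# High SNR: alpha-MI should approach the Renyi entropy of order 1/alpha.
high = alpha_mi(xs, ps, snr=400.0, alpha=alpha)
target = renyi_entropy(ps, 1.0 / alpha)

# Low SNR: the ratio alpha-MI / snr should approach alpha * Var(X) / 2.
var = np.dot(ps, np.square(xs)) - np.dot(ps, xs) ** 2
snr_small = 1e-3
slope = alpha_mi(xs, ps, snr_small, alpha) / snr_small

print(high, target)
print(slope, alpha * var / 2)
```

The integrand is handled in the log domain so that far tails underflow to zero instead of producing NaNs. The same function can be called with other point masses or other values of α, excluding α = 1 where the prefactor is undefined.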

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The tilted-distribution construction may permit numerical evaluation of α-mutual information by reusing existing MMSE estimators.
Strict concavity properties could guarantee uniqueness in optimization problems that maximize or minimize α-mutual information.
The same regularity framework might be testable on other additive noise channels to check whether the α-I-MMSE relation survives beyond the Gaussian case.
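The first extension above can be illustrated by integrating a tilted MMSE over SNR and comparing the result with a direct evaluation of α-mutual information. This is a minimal sketch under stated assumptions, not the paper's procedure: the helper `tilted_quantities`, the specific tilted pair (output density proportional to g^(1/α), posterior proportional to p(x)·φ^α), the equiprobable binary input, and the channel normalization are choices made here.

```python
import numpy as np

def tilted_quantities(xs, ps, snr, alpha, n=8001, pad=12.0):
    """Return (I_alpha in nats, mmse_alpha) for a discrete input over
    Y = sqrt(snr) X + N, N ~ N(0, 1). Requires alpha > 0 and alpha != 1."""
    root = np.sqrt(snr)
    y = np.linspace(root * xs.min() - pad, root * xs.max() + pad, n)
    dy = y[1] - y[0]
    d = y[None, :] - root * xs[:, None]
    log_w = np.log(ps)[:, None] + alpha * (-0.5 * d**2 - 0.5 * np.log(2 * np.pi))
    top = log_w.max(axis=0)
    log_g = top + np.log(np.exp(log_w - top).sum(axis=0))
    f = np.exp(log_g / alpha)             # unnormalized density of Y_alpha (assumed form)
    z = f.sum() * dy
    post = np.exp(log_w - log_g)          # tilted posterior of X given y (assumed form)
    mean = (post * xs[:, None]).sum(axis=0)
    var = (post * xs[:, None] ** 2).sum(axis=0) - mean**2
    mmse = (var * f).sum() * dy / z       # average tilted posterior variance
    return alpha / (alpha - 1.0) * np.log(z), mmse

# Hypothetical test case: equiprobable binary input, alpha below one.
xs, ps = np.array([-1.0, 1.0]), np.array([0.5, 0.5])
alpha, snr_max = 0.5, 4.0

# Trapezoidal rule for the integral of (alpha / 2) * mmse_alpha over [0, snr_max].
grid = np.linspace(0.0, snr_max, 801)
mmse = np.array([tilted_quantities(xs, ps, s, alpha)[1] for s in grid])
step = grid[1] - grid[0]
integral = 0.5 * alpha * step * (mmse.sum() - 0.5 * (mmse[0] + mmse[-1]))

direct = tilted_quantities(xs, ps, snr_max, alpha)[0]
print(integral, direct)
```

Because α-mutual information vanishes at zero SNR, the two printed numbers should agree up to quadrature error if the identity holds with the tilted pair assumed here. An existing MMSE routine could replace the posterior-variance lines, provided it is fed the tilted prior and likelihood.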

Load-bearing premise

The assumption that α-mutual information remains finite and differentiable with respect to SNR for the input distributions and α values of interest.

What would settle it

A specific input distribution and α value, within the paper's stated finiteness conditions, for which the derivative of α-mutual information with respect to SNR fails to equal α/2 times the MMSE computed under the corresponding tilted distributions.
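A search for such a counterexample can start from a finite-difference check. The sketch below is written for this page and rests on assumptions: the helper `mi_and_mmse`, the form of the tilted distributions, the three-point test input, and the step size `h` are illustrative choices, and a small gap on a discrete input says nothing about the heavy-tailed continuous inputs where the identity is most at risk.

```python
import numpy as np

def mi_and_mmse(xs, ps, snr, alpha, n=20001, pad=12.0):
    """Return (I_alpha in nats, mmse_alpha) for a discrete input over
    Y = sqrt(snr) X + N, N ~ N(0, 1). Requires alpha > 0 and alpha != 1."""
    root = np.sqrt(snr)
    y = np.linspace(root * xs.min() - pad, root * xs.max() + pad, n)
    dy = y[1] - y[0]
    d = y[None, :] - root * xs[:, None]
    log_w = np.log(ps)[:, None] + alpha * (-0.5 * d**2 - 0.5 * np.log(2 * np.pi))
    top = log_w.max(axis=0)
    log_g = top + np.log(np.exp(log_w - top).sum(axis=0))
    f = np.exp(log_g / alpha)        # unnormalized tilted output density (assumed form)
    z = f.sum() * dy
    post = np.exp(log_w - log_g)     # tilted posterior weights (assumed form)
    mean = (post * xs[:, None]).sum(axis=0)
    var = (post * xs[:, None] ** 2).sum(axis=0) - mean**2
    return alpha / (alpha - 1.0) * np.log(z), (var * f).sum() * dy / z

# Hypothetical three-point input with nonzero mean.
xs = np.array([-1.0, 0.0, 2.0])
ps = np.array([0.3, 0.5, 0.2])
snr, h = 1.5, 1e-4

gaps = []
for alpha in (0.5, 2.0, 5.0):
    up, _ = mi_and_mmse(xs, ps, snr + h, alpha)
    down, _ = mi_and_mmse(xs, ps, snr - h, alpha)
    _, mmse = mi_and_mmse(xs, ps, snr, alpha)
    # Central difference of I_alpha versus (alpha / 2) * mmse_alpha.
    gaps.append(abs((up - down) / (2 * h) - 0.5 * alpha * mmse))

print(gaps)
```

A gap well above the finite-difference and quadrature error for some admissible input would be evidence against the identity as read here, or against the assumed form of the tilt.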

Figures

Figures reproduced from arXiv: 2604.10922 by Alex Dytso, Martina Cardone, Mohammad Milanian.

**Figure 1.** I_α(X; snr) and the corresponding α-MMSE mmse_α(X; snr) versus snr for different values of α.

Adjacent text in the source (B. α-I-MMSE Relationship): We here present an expression that connects α-mutual information and MMSE. In particular, this result shows that the rate of α-mutual information increase as SNR increases is equal to a fraction α/2 of the MMSE achieved by the optimal estimator of X_α given Y_α. The main result, which generali…

**Figure 2.** I_α(X; snr) and the corresponding α-MMSE mmse_α(X; snr) versus α for different values of snr.

Adjacent text in the source (C. On Generalized de Bruijn's Identity): The classical I-MMSE relationship is known to be equivalent to de Bruijn's identity [57], which relates the derivative of the Shannon differential entropy to the Fisher information. This connection is typically established via Brown's identity. Equipped with our generalizat…

Original abstract

In this paper, we study Sibson's $\alpha$-mutual information in the context of the additive Gaussian noise channel. While the classical case $\alpha = 1$ is well understood and admits deep connections to estimation-theoretic quantities, such as the minimum mean-square error (MMSE) and Fisher information, many of the corresponding structural properties for general $\alpha$ remain less explored. Our goal is to develop a systematic understanding of $\alpha$-mutual information in the Gaussian noise setting and to identify which properties extend beyond the Shannon case. To this end, we establish several regularity properties, including finiteness conditions, continuity with respect to the signal-to-noise ratio (SNR) and the input distribution, and strict concavity/convexity properties that ensure uniqueness in associated optimization problems. A central contribution is the development of an $\alpha$-I-MMSE relationship, generalizing the classical identity by relating the derivative of $\alpha$-mutual information with respect to SNR to the MMSE evaluated under appropriately tilted distributions. This connection further leads to a generalized de Bruijn identity and new estimation-theoretic representations of R\'enyi entropy and differential R\'enyi entropy. We also characterize the low- and high-SNR behavior. In the low-SNR regime, the first-order behavior depends only on the input variance. In the high-SNR regime, for discrete inputs, $\alpha$-mutual information converges to the R\'enyi entropy of order $1/\alpha$, while for general inputs we connect it to $\alpha$-information dimension. Overall, our results show that many fundamental relationships between information and estimation extend beyond the Shannon setting, in a form involving $\alpha$-tilted distributions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper extends the I-MMSE relation to α-mutual information via tilted distributions on the Gaussian channel and adds regularity properties plus low/high-SNR expansions, but the differentiability step for the derivative identity needs explicit checks for unbounded inputs.

The main new piece is the α-I-MMSE relation: the derivative of α-mutual information with respect to SNR equals α/2 times the MMSE computed under an α-tilted output distribution. From there they derive a generalized de Bruijn identity and estimation-based expressions for Rényi entropy and differential Rényi entropy. They also give a low-SNR expansion whose first-order term depends only on input variance, and high-SNR limits that recover Rényi entropy for discrete inputs or α-information dimension for general ones. That is concrete and useful within the subfield.

They do a solid job on the supporting regularity results: finiteness conditions, continuity in SNR and input distribution, and the convexity/concavity properties that guarantee uniqueness in the associated optimization problems. The Gaussian channel keeps the tilting explicit, which helps.

The soft spot is the justification for differentiating under the integral to get the α-I-MMSE identity. The paper invokes regularity to ensure the tilted measures are well-defined, but for inputs with unbounded support the α-divergence or the normalizing constant for the tilt can fail to be finite even when the un-tilted α-MI stays finite. If the proofs only assume the conditions without verifying the interchange or giving counter-examples, that leaves a gap that a referee would want tightened.

This is for information theorists already working on generalized mutual information or Rényi-based privacy and estimation. A reader who wants to see how the classical I-MMSE and de Bruijn connections survive the α-extension will find it worth reading. It is not broad enough to interest people outside that niche. It deserves a serious referee because the identities are plausible, the setting is clean, and the gaps are fixable rather than fatal. Send it out, but ask the authors to spell out the precise conditions under which the derivative identity holds.

Referee Report

2 major / 3 minor

Summary. The paper studies Sibson's α-mutual information for the additive Gaussian noise channel. It establishes finiteness conditions, continuity in SNR and input distribution, and strict concavity/convexity properties. A central result is an α-I-MMSE identity relating the derivative of α-MI w.r.t. SNR to the MMSE under α-tilted output distributions; this yields a generalized de Bruijn identity and new representations of Rényi entropy. The work also characterizes low-SNR behavior (depending only on input variance) and high-SNR asymptotics (convergence to Rényi entropy of order 1/α for discrete inputs, or α-information dimension for general inputs).

Significance. If the regularity conditions and derivative identities hold rigorously, the α-I-MMSE relation and its consequences provide a systematic extension of classical information-estimation links (I-MMSE, de Bruijn) to the Rényi/α setting for Gaussian channels. The low- and high-SNR characterizations and convexity results are useful for optimization problems involving α-MI. The manuscript supplies concrete asymptotic expressions and tilted-distribution representations, which are falsifiable and potentially reusable.

major comments (2)

[§4 (α-I-MMSE theorem and proof)] The central α-I-MMSE identity (abstract and §4) requires differentiability of α-MI w.r.t. SNR and interchange of derivative with the integral representation of α-MI. The paper invokes finiteness, continuity, and regularity properties to justify this, but does not explicitly verify that the α-tilted measures remain valid probability distributions (i.e., the normalizing constant is finite and the density exists) for all inputs where α-MI is declared finite, particularly for unbounded-support continuous inputs at finite SNR. This is load-bearing for the derivative claim.
[§3 (regularity properties) and §4] The finiteness conditions for α-MI (stated in §3) are given, but the manuscript does not supply an explicit check or counter-example showing that these conditions automatically guarantee the existence of the tilted density and the validity of the derivative identity when the input has slow-decaying tails. Without this, the scope of the α-I-MMSE relation remains unclear.

minor comments (3)

[§4] The definition of the α-tilted output distribution (Eq. (X) in §4) should be written explicitly with the normalizing constant shown, to make the subsequent MMSE expression immediately verifiable.
[§5 (high-SNR asymptotics)] Notation for the α-MI functional and the tilting parameter should be introduced once and used consistently; occasional reuse of I_α without the channel subscript creates minor ambiguity in the high-SNR section.
[§6] The low-SNR expansion (first-order term depending only on variance) is stated cleanly, but a short remark on whether higher-order terms involve higher moments would help readers compare with the classical I-MMSE expansion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments correctly identify points where the exposition of regularity conditions and the justification for the derivative identity can be strengthened. We address each major comment below and will revise the manuscript accordingly.

Referee: [§4 (α-I-MMSE theorem and proof)] The central α-I-MMSE identity (abstract and §4) requires differentiability of α-MI w.r.t. SNR and interchange of derivative with the integral representation of α-MI. The paper invokes finiteness, continuity, and regularity properties to justify this, but does not explicitly verify that the α-tilted measures remain valid probability distributions (i.e., the normalizing constant is finite and the density exists) for all inputs where α-MI is declared finite, particularly for unbounded-support continuous inputs at finite SNR. This is load-bearing for the derivative claim.

Authors: We agree that an explicit verification would strengthen the argument. The finiteness of α-MI is defined through the finiteness of the integral appearing in its expression, which is precisely the normalizing constant of the α-tilted output measure; hence the tilted object is a probability distribution whenever α-MI is finite. Because the channel is Gaussian, absolute continuity with respect to Lebesgue measure is preserved under the tilting operation for any input distribution that induces a well-defined output density. For inputs with unbounded support the moment conditions implicit in the finiteness statement of §3 already control the tails sufficiently for the dominated-convergence argument used to interchange differentiation and integration. In the revision we will insert a short lemma (or dedicated remark) immediately preceding the α-I-MMSE theorem that states and proves these facts, thereby making the scope of the identity fully explicit.

Revision: yes

Referee: [§3 (regularity properties) and §4] The finiteness conditions for α-MI (stated in §3) are given, but the manuscript does not supply an explicit check or counter-example showing that these conditions automatically guarantee the existence of the tilted density and the validity of the derivative identity when the input has slow-decaying tails. Without this, the scope of the α-I-MMSE relation remains unclear.

Authors: The finiteness conditions listed in §3 are formulated exactly so that the relevant integrals remain finite after the Gaussian convolution and the subsequent α-tilting; the Gaussian kernel supplies enough smoothing that slow polynomial decay of the input tails does not prevent the tilted density from existing. We therefore do not expect counter-examples inside the regime where α-MI is declared finite. To remove any ambiguity we will add, in the revised §3, a brief paragraph together with a concrete illustration (e.g., a Student-t input with degrees of freedom chosen so that α-MI remains finite) confirming that the tilted density exists and that the derivative identity continues to hold. This addition will clarify the applicability of the α-I-MMSE relation without altering the stated theorems.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity in the α-I-MMSE derivation

The paper establishes finiteness, continuity, and convexity/concavity properties of α-mutual information independently before deriving the α-I-MMSE identity as a consequence of differentiating the α-MI functional with respect to SNR and relating it to MMSE under α-tilted distributions constructed directly from the input and Gaussian channel law. No quoted step reduces the central claim to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation is presented as extending the classical I-MMSE relation via explicit tilted measures rather than by tautology or renaming. The provided abstract and context show a self-contained chain against the definitions of α-MI and the channel.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard measure-theoretic assumptions for mutual information to be finite, plus new definitions of α-tilted distributions whose validity is asserted under finiteness conditions stated in the abstract.

axioms (2)

domain assumption: α-mutual information is finite and differentiable with respect to SNR under the stated regularity conditions. Invoked to justify the derivative identities and continuity claims.
standard math: Standard properties of Rényi entropy and differential entropy carry over to the α-tilted measures. Used for the generalized de Bruijn identity and high-SNR limits.

pith-pipeline@v0.9.0 · 5617 in / 1618 out tokens · 63069 ms · 2026-05-10T16:06:04.329986+00:00 · methodology

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages

[1] A. Rényi, “On Measures of Entropy and Information,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. University of California Press, 1961, pp. 547–562.
[2] T. A. Courtade and S. Verdú, “Cumulant Generating Function of Codeword Lengths in Optimal Lossless Compression,” in 2014 IEEE International Symposium on Information Theory, 2014, pp. 2494–2498.
[3] M. Ben-Bassat and J. Raviv, “Rényi’s Entropy and the Probability of Error,” IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 324–331, 1978.
[4] I. Sason and S. Verdú, “Arimoto-Rényi Conditional Entropy and Bayesian m-ary Hypothesis Testing,” IEEE Transactions on Information Theory, vol. 64, no. 1, pp. 4–25, 2017.
[5] E. Arikan, “An Inequality on Guessing and its Application to Sequential Decoding,” IEEE Transactions on Information Theory, vol. 42, no. 1, pp. 99–105, 1996.
[6] O. Rioul, “A Primer on Alpha-Information Theory with Application to Leakage in Secrecy Systems,” in International Conference on Geometric Science of Information. Springer, 2021, pp. 459–467.
[7] S. Arimoto, “Information Measures and Capacity of Order α for Discrete Memoryless Channels,” Topics in Information Theory, 1977.
[8] R. Sibson, “Information Radius,” Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, vol. 14, no. 2, pp. 149–160, 1969.
[9] I. Csiszár, “Generalized Cutoff Rates and Rényi’s Information Measures,” IEEE Transactions on Information Theory, vol. 41, no. 1, pp. 26–34, 1995.
[10] U. Augustin, “Noisy Channels,” Ph.D. dissertation, Universität Erlangen–Nürnberg, 1978, Habilitation Thesis.
[11] A. Lapidoth and C. Pfister, “Two Measures of Dependence,” Entropy, vol. 21, no. 8, 2019.
[12] A. R. Esposito, M. Gastpar, and I. Issa, “Sibson α-Mutual Information and Its Variational Representations,” IEEE Transactions on Information Theory, pp. 1–1, 2025.
[13] Y. Polyanskiy and S. Verdú, “Arimoto Channel Coding Converse and Rényi Divergence,” in 2010 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2010, pp. 1327–1333.
[14] A. R. Esposito, M. Gastpar, and I. Issa, “Generalization Error Bounds via Rényi-, f-Divergences and Maximal Leakage,” IEEE Transactions on Information Theory, vol. 67, no. 8, pp. 4986–5004, 2021.
[15] S. Saito, “On Meta-Bound for Lower Bounds of Bayes Risk,” in 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 3162–3167.
[16] D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual Information and Minimum Mean-Square Error in Gaussian Channels,” IEEE Transactions on Information Theory, vol. 51, no. 4, pp. 1261–1282, 2005.
[17] Y. Wu, D. Guo, and S. Verdú, “Derivative of Mutual Information at Zero SNR: The Gaussian-Noise Case,” IEEE Transactions on Information Theory, vol. 57, no. 11, pp. 7307–7312, 2011.
[18] S. Verdú, “α-Mutual Information,” in 2015 Information Theory and Applications Workshop (ITA), 2015, pp. 1–6.
[19] C. Shannon, “The Zero Error Capacity of a Noisy Channel,” IRE Transactions on Information Theory, vol. 2, no. 3, pp. 8–19, 1956.
[20] R. Gallager, “A Simple Derivation of the Coding Theorem and Some Applications,” IEEE Transactions on Information Theory, vol. 11, no. 1, pp. 3–18, 1965.
[21] T. A. Courtade and S. Verdú, “Variable-Length Lossy Compression and Channel Coding: Non-Asymptotic Converses via Cumulant Generating Functions,” in 2014 IEEE International Symposium on Information Theory, 2014, pp. 2499–2503.
[22] S. Arimoto, “On the Converse to the Coding Theorem for Discrete Memoryless Channels (corresp.),” IEEE Transactions on Information Theory, vol. 19, no. 3, pp. 357–359, 1973.
[23] S. Verdú, “Error Exponents and α-Mutual Information,” Entropy, vol. 23, no. 2, p. 199, 2021.
[24] S. Yagli and P. Cuff, “Exact Exponent for Soft Covering,” IEEE Transactions on Information Theory, vol. 65, no. 10, pp. 6234–6262, 2019.
[25] I. Issa, A. B. Wagner, and S. Kamath, “An Operational Approach to Information Leakage,” IEEE Transactions on Information Theory, vol. 66, no. 3, pp. 1625–1657, 2019.
[26] J. Liao, O. Kosut, L. Sankar, and F. du Pin Calmon, “Tunable Measures for Information Leakage and Applications to Privacy-Utility Tradeoffs,” IEEE Transactions on Information Theory, vol. 65, no. 12, pp. 8043–8066, 2019.
[27] A. Xu and M. Raginsky, “Information-Theoretic Lower Bounds on Bayes Risk in Decentralized Estimation,” IEEE Transactions on Information Theory, vol. 63, no. 3, pp. 1580–1600, 2016.
[28] A. R. Esposito and M. Gastpar, “Lower-Bounds on the Bayesian Risk in Estimation Procedures via Sibson’s α-Mutual Information,” in 2021 IEEE International Symposium on Information Theory (ISIT), 2021, pp. 748–753.
[29] M. Bondaschi and M. Gastpar, “Alpha-NML Universal Predictors,” IEEE Transactions on Information Theory, vol. 71, no. 2, pp. 1171–1183, 2025.
[30] S.-W. Ho and S. Verdú, “Convexity/Concavity of Rényi Entropy and α-Mutual Information,” in 2015 IEEE International Symposium on Information Theory (ISIT), 2015, pp. 745–749.
[31] A. Kamatsuka, K. Kazama, and T. Yoshida, “Alternating Optimization Approach for Computing α-Mutual Information and α-Capacity,” in 2025 IEEE International Symposium on Information Theory (ISIT), 2025, pp. 1–6.
[32] C. Cai and S. Verdú, “Conditional Rényi Divergence Saddlepoint and the Maximization of α-Mutual Information,” Entropy, vol. 21, no. 10, p. 969, 2019.
[33] Y. Wu and S. Verdú, “Functional Properties of Minimum Mean-Square Error and Mutual Information,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1289–1301, 2012.
[34] D. Guo, Y. Wu, S. Shamai (Shitz), and S. Verdú, “Estimation in Gaussian Noise: Properties of the Minimum Mean-Square Error,” IEEE Transactions on Information Theory, vol. 57, no. 4, pp. 2371–2385, 2011.
[35] D. Guo, S. Shamai (Shitz), and S. Verdú, “The Interplay between Information and Estimation Measures,” Foundations and Trends in Signal Processing, vol. 6, no. 4, pp. 243–429, 2012.
[36] A. Dytso, R. Bustin, H. V. Poor, and S. Shamai (Shitz), “A View of Information-Estimation Relations in Gaussian Networks,” Entropy, vol. 19, no. 8, p. 409, 2017.
[37] A. Guionnet and D. Shlyakhtenko, “On Classical Analogues of Free Entropy Dimension,” Journal of Functional Analysis, vol. 251, no. 2, pp. 738–771, 2007.
[38] Y. Wu and S. Verdú, “MMSE Dimension,” IEEE Transactions on Information Theory, vol. 57, no. 8, pp. 4857–4879, 2011.
[39] H. Wu and L. Yu, “Entropic Isoperimetric and Cramér–Rao Inequalities for Rényi–Fisher Information,” IEEE Transactions on Information Theory, 2025.
[40] E. Ram and I. Sason, “On Rényi Entropy Power Inequalities,” IEEE Transactions on Information Theory, vol. 62, no. 12, pp. 6800–6815, 2016.
[41] T. Van Erven and P. Harremoës, “Rényi Divergence and Kullback-Leibler Divergence,” IEEE Transactions on Information Theory, vol. 60, no. 7, pp. 3797–3820, 2014.
[42] I. Sason and S. Verdú, “f-divergence inequalities,” IEEE Transactions on Information Theory, vol. 62, no. 11, pp. 5973–6006, 2016.
[43] A. Rényi, Probability Theory. Mineola, N.Y.: Dover Publications, 2007, unabridged republication of the work published by North-Holland Publishing Company, Amsterdam, 1970.
[44] I. Csiszár, “On the Dimension and Entropy of Order α of the Mixture of Probability Distributions,” Acta Mathematica Hungarica, vol. 13, no. 3-4, pp. 245–255, 1962.
[45] Y. Wu and S. Verdú, “Rényi Information Dimension: Fundamental Limits of Almost Lossless Analog Compression,” IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3721–3748, 2010.
[46] Y. Wu, S. Shamai (Shitz), and S. Verdú, “Information Dimension and the Degrees of Freedom of the Interference Channel,” IEEE Transactions on Information Theory, vol. 61, no. 1, pp. 256–279, 2015.
[47] J. G. Smith, “The Information Capacity of Amplitude- and Variance-Constrained Scalar Gaussian Channels,” Information and Control, vol. 18, no. 3, pp. 203–219, 1971.
[48] A. Dytso, M. Goldenbaum, H. V. Poor, and S. Shamai (Shitz), “When Are Discrete Channel Inputs Optimal?—Optimization Techniques and Some New Results,” in 2018 52nd Annual Conference on Information Sciences and Systems (CISS), 2018, pp. 1–6.
[49] J. Eisen, R. R. Mazumdar, and P. Mitran, “Capacity-Achieving Input Distributions of Additive Vector Gaussian Noise Channels: Even-Moment Constraints and Unbounded or Compact Support,” Entropy, vol. 25, no. 8, p. 1180, 2023.
[50] M. S. Pinsker, Information and Information Stability of Random Variables and Processes, ser. Holden-Day series in time series analysis. San Francisco: Holden-Day, Inc., 1964.
[51] R. B. Ash, Information Theory. Dover Publications, 1990, originally published in 1965.
[52] H. Robbins, “An Empirical Bayes Approach to Statistics,” in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability. Citeseer, 1956.
[53] C. Hatsell and L. Nolte, “Some Geometric Properties of the Likelihood Ratio (corresp.),” IEEE Transactions on Information Theory, vol. 17, no. 5, pp. 616–618, 1971.
[54] A. Dytso, H. V. Poor, and S. Shamai (Shitz), “Conditional Mean Estimation in Gaussian Noise: A Meta Derivative Identity with Applications,” IEEE Transactions on Information Theory, vol. 69, no. 3, pp. 1883–1898, 2022.
[55] L. D. Brown, “Admissible Estimators, Recurrent Diffusions, and Insoluble Boundary Value Problems,” The Annals of Mathematical Statistics, vol. 42, no. 3, pp. 855–903, 1971.
[56] L. D. Brown, Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory, ser. Institute of Mathematical Statistics Lecture Notes—Monograph Series. Hayward, CA: Institute of Mathematical Statistics, 1986, vol. 9.
[57] A. J. Stam, “Some Inequalities Satisfied by the Quantities of Information of Fisher and Shannon,” Information and Control, vol. 2, no. 2, pp. 101–112, 1959.
[58] S. Verdú, “On Channel Capacity per Unit Cost,” IEEE Transactions on Information Theory, vol. 36, no. 5, pp. 1019–1030, 1990.
[59] A. Lapidoth and S. Shamai (Shitz), “Fading Channels: How Perfect Need ‘Perfect Side Information’ Be?” IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1118–1134, 2002.
[60] S. Verdú, “Spectral Efficiency in the Wideband Regime,” IEEE Transactions on Information Theory, vol. 48, no. 6, pp. 1319–1343, 2002.
[61] D. Guo, “A Simple Proof of the Entropy-Power Inequality,” IEEE Transactions on Information Theory, vol. 52, no. 5, pp. 2165–2166, 2006.
[62] A. M. Tulino and S. Verdú, “Monotonic Decrease of the non-Gaussianness of the Sum of Independent Random Variables: A Simple Proof,” IEEE Transactions on Information Theory, vol. 52, no. 9, pp. 4295–4297, 2006.
[63] D. Guo, S. Shamai (Shitz), and S. Verdú, “Proof of Entropy Power Inequalities via MMSE,” in 2006 IEEE International Symposium on Information Theory, 2006, pp. 1011–1015.
[64] M. Śmieja and J. Tabor, “Rényi Entropy Dimension of the Mixture of Measures,” in 2014 Science and Information Conference, 2014, pp. 685–689.
[65] M. Raginsky and I. Sason, “Concentration of Measure Inequalities in Information Theory, Communications, and Coding,” Foundations and Trends in Communications and Information Theory, vol. 10, no. 1–2, pp. 1–247, 2013.
[66] G. Reeves, H. D. Pfister, and A. Dytso, “Mutual Information as a Function of Matrix SNR for Linear Gaussian Channels,” in 2018 IEEE International Symposium on Information Theory (ISIT), 2018, pp. 1754–1758.
[67] R. Bustin, R. Liu, H. V. Poor, and S. Shamai (Shitz), “An MMSE Approach to the Secrecy Capacity of the MIMO Gaussian Wiretap Channel,” EURASIP Journal on Wireless Communications and Networking, vol. 2009, no. 1, p. 370970, 2009.
[68] A. Lozano, A. M. Tulino, and S. Verdú, “Optimum Power Allocation for Parallel Gaussian Channels with Arbitrary Input Distributions,” IEEE Transactions on Information Theory, vol. 52, no. 7, pp. 3033–3051, 2006.
[69] T. Kawabata and A. Dembo, “The Rate-Distortion Dimension of Sets and Measures,” IEEE Transactions on Information Theory, vol. 40, no. 5, pp. 1564–1572, 1994.
[70] A. Dytso, S. Yagli, H. V. Poor, and S. Shamai (Shitz), “The Capacity Achieving Distribution for the Amplitude Constrained Additive Gaussian Channel: An Upper Bound on the Number of Mass Points,” IEEE Transactions on Information Theory, vol. 66, no. 4, pp. 2006–2022, 2019.
[71] D. Stotz and H. Bölcskei, “Degrees of Freedom in Vector Interference Channels,” IEEE Transactions on Information Theory, vol. 62, no. 7, pp. 4172–4197, 2016.
[72] D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual Information and Conditional Mean Estimation in Poisson Channels,” IEEE Transactions on Information Theory, vol. 54, no. 5, pp. 1837–1849, 2008.
[73] R. Atar and T. Weissman, “Mutual Information, Relative Entropy, and Estimation in the Poisson Channel,” IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1302–1318, 2012.
[74] J. Jiao, K. Venkat, and T. Weissman, “Relations Between Information and Estimation in Discrete-Time Lévy Channels,” IEEE Transactions on Information Theory, vol. 63, no. 6, pp. 3579–3594, 2017.
[75] R. M. Dudley, Real Analysis and Probability. Chapman and Hall/CRC, 2018.
[76] G. B. Folland, Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, 1999.
