Rate-distortion Theory with Lower Semi-continuous Distortion on Noncompact Alphabets
Pith reviewed 2026-05-16 15:43 UTC · model grok-4.3
The pith
Rate-distortion infima are attained for lower semi-continuous distortions on locally compact Polish alphabets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When the reproduction alphabet is a locally compact Polish space and the distortion function is lower semi-continuous, the rate-distortion infimum is attained. Bounded distortions yield attainment through the one-point compactification argument. Unbounded coercive distortions yield attainment through a concentration-compactness argument. Several counterexamples confirm that these attainability conditions are nearly necessary.
What carries the argument
Lower semi-continuity of the distortion function on a locally compact Polish reproduction alphabet, which permits one-point compactification for bounded cases or concentration-compactness for coercive unbounded cases to produce minimizing reconstruction distributions.
If this is right
- The rate-distortion function is achieved by some reconstruction distribution whenever the stated conditions hold.
- Optimal rate-distortion performance is attainable without requiring the reproduction alphabet to be compact.
- Existence extends to common noncompact alphabets such as the real line under bounded or coercive distortions.
- The two mechanisms together give a unified existence theorem for lower semi-continuous distortions.
Where Pith is reading between the lines
- The results justify direct rate-distortion analysis of continuous-valued sources without preliminary discretization.
- Similar existence arguments may apply to related quantities such as channel capacity under matching distortion constraints.
- Numerical optimization procedures for finding optimal distributions gain theoretical support in these noncompact settings.
Load-bearing premise
The reproduction alphabet is a locally compact Polish space.
What would settle it
An explicit locally compact Polish alphabet together with a lower semi-continuous distortion for which no minimizing reconstruction distribution exists would falsify the attainability claim.
read the original abstract
In this paper, we study rate-distortion theory for general sources with an emphasis on the existence of optimal reconstruction distributions on noncompact alphabets. Classical attainability results typically rely on compactness of the reproduction alphabet together with continuity of the distortion function, which may fail in many noncompact settings. We identify two complementary existence mechanisms under lower semi-continuity on locally compact Polish alphabets. For bounded distortions, we prove that the rate-distortion infimum is attained via the one-point compactification argument. For unbounded coercive distortions, we establish existence via concentration-compactness. We also give several counterexamples showing that our attainability results are close to sharp. Our results provide a unified and transparent existence theorem for rate-distortion problems with lower semi-continuous distortions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper establishes existence of optimal reproduction distributions attaining the rate-distortion infimum for general sources when the reproduction alphabet is a locally compact Polish space and the distortion measure is lower semi-continuous. It provides two complementary mechanisms: one-point compactification for bounded distortions, and concentration-compactness for unbounded coercive distortions. Several counterexamples are given to show that the attainability results are close to sharp, yielding a unified existence theorem beyond the classical compact-alphabet setting.
Significance. If the central claims hold, the results meaningfully extend classical rate-distortion attainability theorems (which rely on compactness and continuity) to noncompact alphabets under only lower semi-continuity. The explicit proof strategies, use of standard topological tools, and sharpness counterexamples constitute a solid contribution to information theory on general alphabets, with potential applicability to source coding problems involving unbounded or noncompact reproduction spaces.
major comments (2)
- [unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.
- [bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.
minor comments (2)
- [introduction] Notation for the rate-distortion functional and the sets of admissible reproduction measures should be introduced with explicit definitions early in the paper to improve readability.
- [counterexamples section] A few counterexamples could benefit from explicit numerical values or simple concrete alphabets (e.g., R with quadratic distortion) to make the sharpness claims more transparent.
Simulated Author's Rebuttal
We are grateful to the referee for the insightful comments and the recommendation for major revision. We address each major comment below and will make the necessary revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.
Authors: We thank the referee for highlighting this potential gap. We agree that pointwise coercivity requires additional care to ensure tightness of the minimizing sequences. In the revised manuscript, we will strengthen the argument by first using the tightness of the source measure μ (guaranteed since the alphabet is Polish) to truncate the source to a compact set with high probability. On this compact set, the lower semi-continuity of the distortion allows us to establish effective uniformity of the coercivity for the purpose of applying Markov's inequality to control the escape of mass in Y_n. We will include this detailed step in the concentration-compactness proof. This addresses the concern and confirms the attainment result under the stated conditions. revision: yes
-
Referee: [bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.
Authors: We thank the referee for this constructive comment. We will revise the bounded distortions section to explicitly verify that the one-point compactification preserves the lower semi-continuity of the distortion measure at the added point at infinity. Specifically, we will show that the extended distortion function is lower semi-continuous on the compactified space. Furthermore, we will demonstrate that the rate-distortion infimum remains unchanged under this compactification, as measures assigning positive mass to the infinity point can be approximated by sequences of measures supported on the original alphabet without increasing the mutual information or the expected distortion. These verifications will be added to the manuscript. revision: yes
Circularity Check
No circularity: standard topological arguments for existence
full rationale
The paper derives existence of rate-distortion minimizers via one-point compactification (bounded case) and concentration-compactness (unbounded coercive case) on locally compact Polish spaces under lower semi-continuity. These steps invoke classical results from topology and measure theory without any self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations. All claims rest on external, independently verifiable properties of the spaces and distortion functions rather than reducing to the paper's own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The reproduction alphabet is a locally compact Polish space.
- domain assumption The distortion function is lower semi-continuous.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Generalized Concentration-compactness Lemma I ... Compactness / Vanishing / Dichotomy alternatives
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
A mathematical theory of communication,
C. E. Shannon, “A mathematical theory of communication,”The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948
work page 1948
-
[2]
Coding theorems for a discrete source with a fidelity criterion,
——, “Coding theorems for a discrete source with a fidelity criterion,” International Convention Record, vol. 7, pp. 325–350, 1959
work page 1959
-
[3]
Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression
T. Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs: Prentice-Hall, 1971
work page 1971
-
[4]
Computation of channel capacity and rate-distortion func- tions,
R. Blahut, “Computation of channel capacity and rate-distortion func- tions,”IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, 1972
work page 1972
-
[5]
R. E. Blahut,Principles and Practice of Information Theory. Addison- Wesley Longman Publishing Co., Inc., 1987
work page 1987
-
[6]
T. M. Cover and J. A. Thomas,Elements of Information Theory, 2nd ed. New York, NY , USA: Wiley, 2006
work page 2006
-
[7]
On an extremum problem of information theory,
I. Csiszár, “On an extremum problem of information theory,”Studia Scientiarum Mathematicarum Hungarica, vol. 9, no. 1, pp. 57–71, 1974
work page 1974
-
[8]
An algorithm for computing the capacity of arbitrary discrete memoryless channels,
S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,”IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, 1972
work page 1972
-
[9]
Rate-distortion theory for general sets and measures,
E. Riegler, H. Bölcskei, and G. Koliander, “Rate-distortion theory for general sets and measures,” inProc. lEEE Int. Symp. Inf. Theory (ISIT), Vail, CO, USA, 2018, pp. 101–105
work page 2018
-
[10]
Lossy compression of general random variables,
E. Riegler, G. Koliander, and H. Bölcskei, “Lossy compression of general random variables,”Information and Inference: A Journal of the IMA, vol. 12, no. 3, pp. 1759–1829, 2023
work page 2023
-
[11]
The rate-distortion dimension of sets and measures,
T. Kawabata and A. Dembo, “The rate-distortion dimension of sets and measures,”IEEE Trans. Inf. Theory, vol. 40, no. 5, pp. 1564–1572, 1994
work page 1994
-
[12]
Successive refinement of abstract sources,
V . Kostina and E. Tuncel, “Successive refinement of abstract sources,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6385–6398, 2019
work page 2019
-
[13]
Asymptotic evaluation of certain Markov process expectations for large time, I,
M. D. Donsker and S. S. Varadhan, “Asymptotic evaluation of certain Markov process expectations for large time, I,”Commun. Pure Appl. Math., vol. 28, no. 1, pp. 1–47, 1975
work page 1975
-
[14]
Successive refinement of information,
W. H. Equitz and T. M. Cover, “Successive refinement of information,” IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 269–275, 1991
work page 1991
-
[15]
Kantorovich duality for general transport costs and applications,
N. Gozlan, C. Roberto, P.-M. Samson, and P. Tetali, “Kantorovich duality for general transport costs and applications,”J. Funct. Anal., vol. 273, no. 11, pp. 3327–3405, 2017
work page 2017
-
[16]
A revisit to rate-distortion theory via optimal weak transport,
J. Zou, L. Fan, J. Gao, and J. Wang, “A revisit to rate-distortion theory via optimal weak transport,”arXiv:2501.09362, 2025
-
[17]
Rate distortion theory for general sources with potential application to image compression,
F. Rezaei, N. Ahmed, and C. D. Charalambous, “Rate distortion theory for general sources with potential application to image compression,” International Journal of Applied Mathematical Sciences, vol. 3, no. 2, pp. 141–165, 2006
work page 2006
-
[18]
Y . Polyanskiy and Y . Wu,Information Theory: From Coding to Learn- ing. Cambridge University Press, 2025
work page 2025
-
[19]
P.-L. Lions, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 2, pp. 109–145, 1984
work page 1984
-
[20]
——, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 4, pp. 223–283, 1984
work page 1984
-
[21]
The concentration-compactness principle in the calculus of variations. The limit case, part 1,
——, “The concentration-compactness principle in the calculus of variations. The limit case, part 1,”Revista Matemática Iberoamericana, vol. 1, no. 1, pp. 145–201, 1985
work page 1985
-
[22]
The concentration-compactness principle in the calculus of variations. The limit case, part 2,
——, “The concentration-compactness principle in the calculus of variations. The limit case, part 2,”Revista Matemática Iberoamericana, vol. 1, no. 2, pp. 45–121, 1985
work page 1985
-
[23]
R. M. Gray,Entropy and Information Theory, 2nd ed. New York: Springer, 2011
work page 2011
-
[24]
De-noising by soft-thresholding,
D. L. Donoho, “De-noising by soft-thresholding,”IEEE Trans. Inf. Theory, vol. 41, no. 3, pp. 613–627, 1995
work page 1995
-
[25]
Discontinuous piecewise linear optimiza- tion,
A. R. Conn and M. Mongeau, “Discontinuous piecewise linear optimiza- tion,”Mathematical programming, vol. 80, no. 3, pp. 315–380, 1998
work page 1998
-
[26]
M. Mongeau, “Discontinuous optimization,” inEncyclopedia of Opti- mization. Springer, 2008, pp. 739–744
work page 2008
-
[27]
Discontinuous optimization in batch production using sumt,
I. Imo and D. Leech, “Discontinuous optimization in batch production using sumt,”The International Journal of Production Research, vol. 22, no. 2, pp. 313–321, 1984
work page 1984
-
[28]
A simplex algorithm for piecewise-linear programming i: Derivation and proof,
R. Fourer, “A simplex algorithm for piecewise-linear programming i: Derivation and proof,”Mathematical Programming, vol. 33, no. 2, pp. 204–233, 1985
work page 1985
-
[29]
Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,
W.-L. Chow, “Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,”Mathematische Annalen, vol. 117, no. 1, pp. 98– 105, 1940
work page 1940
-
[30]
About connecting two points of a completely nonholo- nomic space by admissible curve,
P. K. Rashevsky, “About connecting two points of a completely nonholo- nomic space by admissible curve,”Uch. Zapiski Ped. Inst. Libknechta, vol. 2, pp. 83–94, 1938
work page 1938
-
[31]
Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications
R. Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications. Providence, RI: American Mathematical Soc., 2002, vol. 91
work page 2002
-
[32]
Lecture notes on sub-Riemannian geometry,
E. Le Donne, “Lecture notes on sub-Riemannian geometry,” Centro di Ricerca Matematica Ennio de Giorgi, Scuola Normale Superiore, 2023, available online: https://cvgmt.sns.it/paper/5339/
work page 2023
-
[33]
M. P. Do Carmo and J. Flaherty Francis,Riemannian Geometry. Springer, 1992, vol. 2
work page 1992
-
[34]
Nonanticipative rate distortion function and relations to filtering theory,
C. D. Charalambous, P. A. Stavrou, and N. U. Ahmed, “Nonanticipative rate distortion function and relations to filtering theory,”IEEE Transac- tions on Automatic Control, vol. 59, no. 4, pp. 937–952, 2014
work page 2014
-
[35]
Directed information on abstract spaces: Properties and variational equalities,
C. D. Charalambous and P. A. Stavrou, “Directed information on abstract spaces: Properties and variational equalities,”IEEE Trans. Inf. Theory, vol. 62, no. 11, pp. 6019–6052, 2016
work page 2016
-
[36]
P. A. Stavrou, T. Charalambous, C. D. Charalambous, and S. Loyka, “Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,”SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3731–3765, 2018
work page 2018
-
[37]
C. D. Charalambous, T. Charalambous, C. Kourtellaris, and J. H. van Schuppen, “Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,”IEEE Trans. Inf. Theory, vol. 68, no. 3, pp. 1440–1464, 2022
work page 2022
-
[38]
M. Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems. Berlin, Heidelberg: Springer-Verlag, 2000, vol. 306. APPENDIXA PROOF OFTHEOREM6 The proof follows the classical strategy of [19], [20], [38], adapted to the abstract Polish-space setting. We begin by introducing the notion of the concentration ...
work page 2000
-
[39]
For eachm∈N, choosex m ∈ X satisfying Qm(R0)≤ Z BR0 (xm) dµm + 1 m ,lim m→∞ Qm(R0) =Q(R 0). Fix0< ϵ < 1
-
[40]
ChooseR >0such thatQ(R)>1−ϵ > 1
-
[41]
For eachm, lety m ∈ Xsatisfy Qm(R)≤ Z BR(ym) dµm + 1 m ,lim m→∞ Qm(R) =Q(R), where the choice ofy m may depend onϵ. ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem. Hence the ballsB R0(xm)andB R(ym)must intersect. Otherwise, Z BR0 (xm) dµm + Z BR(ym) dµm = Z BR0 (xm)∪BR(ym) d...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.