pith. sign in

arxiv: 2601.07246 · v3 · submitted 2026-01-12 · 💻 cs.IT · math.IT

Rate-distortion Theory with Lower Semi-continuous Distortion on Noncompact Alphabets

Pith reviewed 2026-05-16 15:43 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords rate-distortion theorylower semi-continuous distortionnoncompact alphabetsexistence of optimal distributionsPolish spacesone-point compactificationconcentration-compactness
0
0 comments X

The pith

Rate-distortion infima are attained for lower semi-continuous distortions on locally compact Polish alphabets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes existence of optimal reconstruction distributions for rate-distortion problems when the reproduction alphabet is noncompact. The authors prove two complementary mechanisms under lower semi-continuity: one-point compactification works for bounded distortions, while concentration-compactness handles unbounded coercive distortions. These results apply directly to alphabets such as the real line that classical compactness-based theorems cannot cover. The paper supplies counterexamples showing the conditions are close to sharp.

Core claim

When the reproduction alphabet is a locally compact Polish space and the distortion function is lower semi-continuous, the rate-distortion infimum is attained. Bounded distortions yield attainment through the one-point compactification argument. Unbounded coercive distortions yield attainment through a concentration-compactness argument. Several counterexamples confirm that these attainability conditions are nearly necessary.

What carries the argument

Lower semi-continuity of the distortion function on a locally compact Polish reproduction alphabet, which permits one-point compactification for bounded cases or concentration-compactness for coercive unbounded cases to produce minimizing reconstruction distributions.

If this is right

  • The rate-distortion function is achieved by some reconstruction distribution whenever the stated conditions hold.
  • Optimal rate-distortion performance is attainable without requiring the reproduction alphabet to be compact.
  • Existence extends to common noncompact alphabets such as the real line under bounded or coercive distortions.
  • The two mechanisms together give a unified existence theorem for lower semi-continuous distortions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The results justify direct rate-distortion analysis of continuous-valued sources without preliminary discretization.
  • Similar existence arguments may apply to related quantities such as channel capacity under matching distortion constraints.
  • Numerical optimization procedures for finding optimal distributions gain theoretical support in these noncompact settings.

Load-bearing premise

The reproduction alphabet is a locally compact Polish space.

What would settle it

An explicit locally compact Polish alphabet together with a lower semi-continuous distortion for which no minimizing reconstruction distribution exists would falsify the attainability claim.

read the original abstract

In this paper, we study rate-distortion theory for general sources with an emphasis on the existence of optimal reconstruction distributions on noncompact alphabets. Classical attainability results typically rely on compactness of the reproduction alphabet together with continuity of the distortion function, which may fail in many noncompact settings. We identify two complementary existence mechanisms under lower semi-continuity on locally compact Polish alphabets. For bounded distortions, we prove that the rate-distortion infimum is attained via the one-point compactification argument. For unbounded coercive distortions, we establish existence via concentration-compactness. We also give several counterexamples showing that our attainability results are close to sharp. Our results provide a unified and transparent existence theorem for rate-distortion problems with lower semi-continuous distortions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper establishes existence of optimal reproduction distributions attaining the rate-distortion infimum for general sources when the reproduction alphabet is a locally compact Polish space and the distortion measure is lower semi-continuous. It provides two complementary mechanisms: one-point compactification for bounded distortions, and concentration-compactness for unbounded coercive distortions. Several counterexamples are given to show that the attainability results are close to sharp, yielding a unified existence theorem beyond the classical compact-alphabet setting.

Significance. If the central claims hold, the results meaningfully extend classical rate-distortion attainability theorems (which rely on compactness and continuity) to noncompact alphabets under only lower semi-continuity. The explicit proof strategies, use of standard topological tools, and sharpness counterexamples constitute a solid contribution to information theory on general alphabets, with potential applicability to source coding problems involving unbounded or noncompact reproduction spaces.

major comments (2)
  1. [unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.
  2. [bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.
minor comments (2)
  1. [introduction] Notation for the rate-distortion functional and the sets of admissible reproduction measures should be introduced with explicit definitions early in the paper to improve readability.
  2. [counterexamples section] A few counterexamples could benefit from explicit numerical values or simple concrete alphabets (e.g., R with quadratic distortion) to make the sharpness claims more transparent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the insightful comments and the recommendation for major revision. We address each major comment below and will make the necessary revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.

    Authors: We thank the referee for highlighting this potential gap. We agree that pointwise coercivity requires additional care to ensure tightness of the minimizing sequences. In the revised manuscript, we will strengthen the argument by first using the tightness of the source measure μ (guaranteed since the alphabet is Polish) to truncate the source to a compact set with high probability. On this compact set, the lower semi-continuity of the distortion allows us to establish effective uniformity of the coercivity for the purpose of applying Markov's inequality to control the escape of mass in Y_n. We will include this detailed step in the concentration-compactness proof. This addresses the concern and confirms the attainment result under the stated conditions. revision: yes

  2. Referee: [bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.

    Authors: We thank the referee for this constructive comment. We will revise the bounded distortions section to explicitly verify that the one-point compactification preserves the lower semi-continuity of the distortion measure at the added point at infinity. Specifically, we will show that the extended distortion function is lower semi-continuous on the compactified space. Furthermore, we will demonstrate that the rate-distortion infimum remains unchanged under this compactification, as measures assigning positive mass to the infinity point can be approximated by sequences of measures supported on the original alphabet without increasing the mutual information or the expected distortion. These verifications will be added to the manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: standard topological arguments for existence

full rationale

The paper derives existence of rate-distortion minimizers via one-point compactification (bounded case) and concentration-compactness (unbounded coercive case) on locally compact Polish spaces under lower semi-continuity. These steps invoke classical results from topology and measure theory without any self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations. All claims rest on external, independently verifiable properties of the spaces and distortion functions rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions about the alphabet and distortion; no free parameters or invented entities are introduced.

axioms (2)
  • domain assumption The reproduction alphabet is a locally compact Polish space.
    Invoked to enable one-point compactification and concentration-compactness arguments on the space.
  • domain assumption The distortion function is lower semi-continuous.
    Key regularity condition used to guarantee attainment of the infimum in both cases.

pith-pipeline@v0.9.0 · 5432 in / 1200 out tokens · 52003 ms · 2026-05-16T15:43:46.765574+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

  1. [1]

    A mathematical theory of communication,

    C. E. Shannon, “A mathematical theory of communication,”The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948

  2. [2]

    Coding theorems for a discrete source with a fidelity criterion,

    ——, “Coding theorems for a discrete source with a fidelity criterion,” International Convention Record, vol. 7, pp. 325–350, 1959

  3. [3]

    Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression

    T. Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs: Prentice-Hall, 1971

  4. [4]

    Computation of channel capacity and rate-distortion func- tions,

    R. Blahut, “Computation of channel capacity and rate-distortion func- tions,”IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, 1972

  5. [5]

    R. E. Blahut,Principles and Practice of Information Theory. Addison- Wesley Longman Publishing Co., Inc., 1987

  6. [6]

    T. M. Cover and J. A. Thomas,Elements of Information Theory, 2nd ed. New York, NY , USA: Wiley, 2006

  7. [7]

    On an extremum problem of information theory,

    I. Csiszár, “On an extremum problem of information theory,”Studia Scientiarum Mathematicarum Hungarica, vol. 9, no. 1, pp. 57–71, 1974

  8. [8]

    An algorithm for computing the capacity of arbitrary discrete memoryless channels,

    S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,”IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, 1972

  9. [9]

    Rate-distortion theory for general sets and measures,

    E. Riegler, H. Bölcskei, and G. Koliander, “Rate-distortion theory for general sets and measures,” inProc. lEEE Int. Symp. Inf. Theory (ISIT), Vail, CO, USA, 2018, pp. 101–105

  10. [10]

    Lossy compression of general random variables,

    E. Riegler, G. Koliander, and H. Bölcskei, “Lossy compression of general random variables,”Information and Inference: A Journal of the IMA, vol. 12, no. 3, pp. 1759–1829, 2023

  11. [11]

    The rate-distortion dimension of sets and measures,

    T. Kawabata and A. Dembo, “The rate-distortion dimension of sets and measures,”IEEE Trans. Inf. Theory, vol. 40, no. 5, pp. 1564–1572, 1994

  12. [12]

    Successive refinement of abstract sources,

    V . Kostina and E. Tuncel, “Successive refinement of abstract sources,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6385–6398, 2019

  13. [13]

    Asymptotic evaluation of certain Markov process expectations for large time, I,

    M. D. Donsker and S. S. Varadhan, “Asymptotic evaluation of certain Markov process expectations for large time, I,”Commun. Pure Appl. Math., vol. 28, no. 1, pp. 1–47, 1975

  14. [14]

    Successive refinement of information,

    W. H. Equitz and T. M. Cover, “Successive refinement of information,” IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 269–275, 1991

  15. [15]

    Kantorovich duality for general transport costs and applications,

    N. Gozlan, C. Roberto, P.-M. Samson, and P. Tetali, “Kantorovich duality for general transport costs and applications,”J. Funct. Anal., vol. 273, no. 11, pp. 3327–3405, 2017

  16. [16]

    A revisit to rate-distortion theory via optimal weak transport,

    J. Zou, L. Fan, J. Gao, and J. Wang, “A revisit to rate-distortion theory via optimal weak transport,”arXiv:2501.09362, 2025

  17. [17]

    Rate distortion theory for general sources with potential application to image compression,

    F. Rezaei, N. Ahmed, and C. D. Charalambous, “Rate distortion theory for general sources with potential application to image compression,” International Journal of Applied Mathematical Sciences, vol. 3, no. 2, pp. 141–165, 2006

  18. [18]

    Polyanskiy and Y

    Y . Polyanskiy and Y . Wu,Information Theory: From Coding to Learn- ing. Cambridge University Press, 2025

  19. [19]

    The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,

    P.-L. Lions, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 2, pp. 109–145, 1984

  20. [20]

    The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,

    ——, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 4, pp. 223–283, 1984

  21. [21]

    The concentration-compactness principle in the calculus of variations. The limit case, part 1,

    ——, “The concentration-compactness principle in the calculus of variations. The limit case, part 1,”Revista Matemática Iberoamericana, vol. 1, no. 1, pp. 145–201, 1985

  22. [22]

    The concentration-compactness principle in the calculus of variations. The limit case, part 2,

    ——, “The concentration-compactness principle in the calculus of variations. The limit case, part 2,”Revista Matemática Iberoamericana, vol. 1, no. 2, pp. 45–121, 1985

  23. [23]

    R. M. Gray,Entropy and Information Theory, 2nd ed. New York: Springer, 2011

  24. [24]

    De-noising by soft-thresholding,

    D. L. Donoho, “De-noising by soft-thresholding,”IEEE Trans. Inf. Theory, vol. 41, no. 3, pp. 613–627, 1995

  25. [25]

    Discontinuous piecewise linear optimiza- tion,

    A. R. Conn and M. Mongeau, “Discontinuous piecewise linear optimiza- tion,”Mathematical programming, vol. 80, no. 3, pp. 315–380, 1998

  26. [26]

    Discontinuous optimization,

    M. Mongeau, “Discontinuous optimization,” inEncyclopedia of Opti- mization. Springer, 2008, pp. 739–744

  27. [27]

    Discontinuous optimization in batch production using sumt,

    I. Imo and D. Leech, “Discontinuous optimization in batch production using sumt,”The International Journal of Production Research, vol. 22, no. 2, pp. 313–321, 1984

  28. [28]

    A simplex algorithm for piecewise-linear programming i: Derivation and proof,

    R. Fourer, “A simplex algorithm for piecewise-linear programming i: Derivation and proof,”Mathematical Programming, vol. 33, no. 2, pp. 204–233, 1985

  29. [29]

    Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,

    W.-L. Chow, “Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,”Mathematische Annalen, vol. 117, no. 1, pp. 98– 105, 1940

  30. [30]

    About connecting two points of a completely nonholo- nomic space by admissible curve,

    P. K. Rashevsky, “About connecting two points of a completely nonholo- nomic space by admissible curve,”Uch. Zapiski Ped. Inst. Libknechta, vol. 2, pp. 83–94, 1938

  31. [31]

    Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications

    R. Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications. Providence, RI: American Mathematical Soc., 2002, vol. 91

  32. [32]

    Lecture notes on sub-Riemannian geometry,

    E. Le Donne, “Lecture notes on sub-Riemannian geometry,” Centro di Ricerca Matematica Ennio de Giorgi, Scuola Normale Superiore, 2023, available online: https://cvgmt.sns.it/paper/5339/

  33. [33]

    M. P. Do Carmo and J. Flaherty Francis,Riemannian Geometry. Springer, 1992, vol. 2

  34. [34]

    Nonanticipative rate distortion function and relations to filtering theory,

    C. D. Charalambous, P. A. Stavrou, and N. U. Ahmed, “Nonanticipative rate distortion function and relations to filtering theory,”IEEE Transac- tions on Automatic Control, vol. 59, no. 4, pp. 937–952, 2014

  35. [35]

    Directed information on abstract spaces: Properties and variational equalities,

    C. D. Charalambous and P. A. Stavrou, “Directed information on abstract spaces: Properties and variational equalities,”IEEE Trans. Inf. Theory, vol. 62, no. 11, pp. 6019–6052, 2016

  36. [36]

    Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,

    P. A. Stavrou, T. Charalambous, C. D. Charalambous, and S. Loyka, “Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,”SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3731–3765, 2018

  37. [37]

    Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,

    C. D. Charalambous, T. Charalambous, C. Kourtellaris, and J. H. van Schuppen, “Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,”IEEE Trans. Inf. Theory, vol. 68, no. 3, pp. 1440–1464, 2022

  38. [38]

    Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems

    M. Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems. Berlin, Heidelberg: Springer-Verlag, 2000, vol. 306. APPENDIXA PROOF OFTHEOREM6 The proof follows the classical strategy of [19], [20], [38], adapted to the abstract Polish-space setting. We begin by introducing the notion of the concentration ...

  39. [39]

    Fix0< ϵ < 1

    For eachm∈N, choosex m ∈ X satisfying Qm(R0)≤ Z BR0 (xm) dµm + 1 m ,lim m→∞ Qm(R0) =Q(R 0). Fix0< ϵ < 1

  40. [40]

    ChooseR >0such thatQ(R)>1−ϵ > 1

  41. [41]

    ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem

    For eachm, lety m ∈ Xsatisfy Qm(R)≤ Z BR(ym) dµm + 1 m ,lim m→∞ Qm(R) =Q(R), where the choice ofy m may depend onϵ. ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem. Hence the ballsB R0(xm)andB R(ym)must intersect. Otherwise, Z BR0 (xm) dµm + Z BR(ym) dµm = Z BR0 (xm)∪BR(ym) d...