Rate-distortion Theory with Lower Semi-continuous Distortion on Noncompact Alphabets

Jia Wang; Jiayang Gao; Jiayang Zou; Luyao Fan

arxiv: 2601.07246 · v3 · submitted 2026-01-12 · 💻 cs.IT · math.IT

Rate-distortion Theory with Lower Semi-continuous Distortion on Noncompact Alphabets

Jiayang Zou , Luyao Fan , Jiayang Gao , Jia Wang This is my paper

Pith reviewed 2026-05-16 15:43 UTC · model grok-4.3

classification 💻 cs.IT math.IT

keywords rate-distortion theorylower semi-continuous distortionnoncompact alphabetsexistence of optimal distributionsPolish spacesone-point compactificationconcentration-compactness

0 comments

The pith

Rate-distortion infima are attained for lower semi-continuous distortions on locally compact Polish alphabets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes existence of optimal reconstruction distributions for rate-distortion problems when the reproduction alphabet is noncompact. The authors prove two complementary mechanisms under lower semi-continuity: one-point compactification works for bounded distortions, while concentration-compactness handles unbounded coercive distortions. These results apply directly to alphabets such as the real line that classical compactness-based theorems cannot cover. The paper supplies counterexamples showing the conditions are close to sharp.

Core claim

When the reproduction alphabet is a locally compact Polish space and the distortion function is lower semi-continuous, the rate-distortion infimum is attained. Bounded distortions yield attainment through the one-point compactification argument. Unbounded coercive distortions yield attainment through a concentration-compactness argument. Several counterexamples confirm that these attainability conditions are nearly necessary.

What carries the argument

Lower semi-continuity of the distortion function on a locally compact Polish reproduction alphabet, which permits one-point compactification for bounded cases or concentration-compactness for coercive unbounded cases to produce minimizing reconstruction distributions.

If this is right

The rate-distortion function is achieved by some reconstruction distribution whenever the stated conditions hold.
Optimal rate-distortion performance is attainable without requiring the reproduction alphabet to be compact.
Existence extends to common noncompact alphabets such as the real line under bounded or coercive distortions.
The two mechanisms together give a unified existence theorem for lower semi-continuous distortions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The results justify direct rate-distortion analysis of continuous-valued sources without preliminary discretization.
Similar existence arguments may apply to related quantities such as channel capacity under matching distortion constraints.
Numerical optimization procedures for finding optimal distributions gain theoretical support in these noncompact settings.

Load-bearing premise

The reproduction alphabet is a locally compact Polish space.

What would settle it

An explicit locally compact Polish alphabet together with a lower semi-continuous distortion for which no minimizing reconstruction distribution exists would falsify the attainability claim.

read the original abstract

In this paper, we study rate-distortion theory for general sources with an emphasis on the existence of optimal reconstruction distributions on noncompact alphabets. Classical attainability results typically rely on compactness of the reproduction alphabet together with continuity of the distortion function, which may fail in many noncompact settings. We identify two complementary existence mechanisms under lower semi-continuity on locally compact Polish alphabets. For bounded distortions, we prove that the rate-distortion infimum is attained via the one-point compactification argument. For unbounded coercive distortions, we establish existence via concentration-compactness. We also give several counterexamples showing that our attainability results are close to sharp. Our results provide a unified and transparent existence theorem for rate-distortion problems with lower semi-continuous distortions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper supplies existence theorems for rate-distortion on noncompact alphabets via one-point compactification and concentration-compactness.

read the letter

The main thing to know is that this paper proves the rate-distortion infimum is attained for lower semi-continuous distortions on locally compact Polish alphabets. It splits the problem into bounded distortions, handled by one-point compactification, and unbounded coercive distortions, handled by concentration-compactness, plus counterexamples that show the conditions are nearly sharp. This extends the classical compact-alphabet results without requiring continuity of the distortion or compactness of the reproduction space. The approach is a direct, useful addition for settings like real-valued or integer alphabets that come up in practice. The outline in the abstract is clean and the reliance on standard Polish-space properties keeps the derivations straightforward with no circularity or fitted parameters. The bounded case looks especially solid because compactification sidesteps tightness questions. The unbounded case is where the stress-test concern lands: pointwise coercivity does not automatically produce tightness when the source measure itself has escaping mass, so the weak-convergence step for the mutual-information functional needs the concentration-compactness argument to be fully rigorous. If the full proofs close that gap, the issue is minor; otherwise it is the spot that referees should examine first. Overall the work is honest and addresses a real gap. It is the sort of paper that information theorists working on general alphabets will want to cite when they need existence without truncation. I would bring it to a reading group focused on foundations. It deserves peer review because the contribution is concrete and the technical claims are checkable.

Referee Report

2 major / 2 minor

Summary. The paper establishes existence of optimal reproduction distributions attaining the rate-distortion infimum for general sources when the reproduction alphabet is a locally compact Polish space and the distortion measure is lower semi-continuous. It provides two complementary mechanisms: one-point compactification for bounded distortions, and concentration-compactness for unbounded coercive distortions. Several counterexamples are given to show that the attainability results are close to sharp, yielding a unified existence theorem beyond the classical compact-alphabet setting.

Significance. If the central claims hold, the results meaningfully extend classical rate-distortion attainability theorems (which rely on compactness and continuity) to noncompact alphabets under only lower semi-continuity. The explicit proof strategies, use of standard topological tools, and sharpness counterexamples constitute a solid contribution to information theory on general alphabets, with potential applicability to source coding problems involving unbounded or noncompact reproduction spaces.

major comments (2)

[unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.
[bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.

minor comments (2)

[introduction] Notation for the rate-distortion functional and the sets of admissible reproduction measures should be introduced with explicit definitions early in the paper to improve readability.
[counterexamples section] A few counterexamples could benefit from explicit numerical values or simple concrete alphabets (e.g., R with quadratic distortion) to make the sharpness claims more transparent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for the insightful comments and the recommendation for major revision. We address each major comment below and will make the necessary revisions to strengthen the manuscript.

read point-by-point responses

Referee: [unbounded coercive distortions section] Unbounded coercive case (concentration-compactness argument): pointwise coercivity (lim_{y→∞} d(x,y)=∞ for each fixed x) does not automatically imply tightness of minimizing sequences {Y_n} when the source μ itself has escaping mass. Without uniformity in x, Markov-type bounds on E_μ[d(X,Y_n)] may fail to control mass escape in Y, so weak convergence in P(Y) is not guaranteed and the attainment step remains open. This directly affects the central existence claim for the unbounded case.

Authors: We thank the referee for highlighting this potential gap. We agree that pointwise coercivity requires additional care to ensure tightness of the minimizing sequences. In the revised manuscript, we will strengthen the argument by first using the tightness of the source measure μ (guaranteed since the alphabet is Polish) to truncate the source to a compact set with high probability. On this compact set, the lower semi-continuity of the distortion allows us to establish effective uniformity of the coercivity for the purpose of applying Markov's inequality to control the escape of mass in Y_n. We will include this detailed step in the concentration-compactness proof. This addresses the concern and confirms the attainment result under the stated conditions. revision: yes
Referee: [bounded distortions section] The one-point compactification argument for bounded distortions is less exposed but should still verify that the extended distortion remains lower semi-continuous at the point at infinity and that the infimum is preserved under the compactification.

Authors: We thank the referee for this constructive comment. We will revise the bounded distortions section to explicitly verify that the one-point compactification preserves the lower semi-continuity of the distortion measure at the added point at infinity. Specifically, we will show that the extended distortion function is lower semi-continuous on the compactified space. Furthermore, we will demonstrate that the rate-distortion infimum remains unchanged under this compactification, as measures assigning positive mass to the infinity point can be approximated by sequences of measures supported on the original alphabet without increasing the mutual information or the expected distortion. These verifications will be added to the manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity: standard topological arguments for existence

full rationale

The paper derives existence of rate-distortion minimizers via one-point compactification (bounded case) and concentration-compactness (unbounded coercive case) on locally compact Polish spaces under lower semi-continuity. These steps invoke classical results from topology and measure theory without any self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations. All claims rest on external, independently verifiable properties of the spaces and distortion functions rather than reducing to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions about the alphabet and distortion; no free parameters or invented entities are introduced.

axioms (2)

domain assumption The reproduction alphabet is a locally compact Polish space.
Invoked to enable one-point compactification and concentration-compactness arguments on the space.
domain assumption The distortion function is lower semi-continuous.
Key regularity condition used to guarantee attainment of the infimum in both cases.

pith-pipeline@v0.9.0 · 5432 in / 1200 out tokens · 52003 ms · 2026-05-16T15:43:46.765574+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Generalized Concentration-compactness Lemma I ... Compactness / Vanishing / Dichotomy alternatives

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

[1]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,”The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948

work page 1948
[2]

Coding theorems for a discrete source with a fidelity criterion,

——, “Coding theorems for a discrete source with a fidelity criterion,” International Convention Record, vol. 7, pp. 325–350, 1959

work page 1959
[3]

Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression

T. Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs: Prentice-Hall, 1971

work page 1971
[4]

Computation of channel capacity and rate-distortion func- tions,

R. Blahut, “Computation of channel capacity and rate-distortion func- tions,”IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, 1972

work page 1972
[5]

R. E. Blahut,Principles and Practice of Information Theory. Addison- Wesley Longman Publishing Co., Inc., 1987

work page 1987
[6]

T. M. Cover and J. A. Thomas,Elements of Information Theory, 2nd ed. New York, NY , USA: Wiley, 2006

work page 2006
[7]

On an extremum problem of information theory,

I. Csiszár, “On an extremum problem of information theory,”Studia Scientiarum Mathematicarum Hungarica, vol. 9, no. 1, pp. 57–71, 1974

work page 1974
[8]

An algorithm for computing the capacity of arbitrary discrete memoryless channels,

S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,”IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, 1972

work page 1972
[9]

Rate-distortion theory for general sets and measures,

E. Riegler, H. Bölcskei, and G. Koliander, “Rate-distortion theory for general sets and measures,” inProc. lEEE Int. Symp. Inf. Theory (ISIT), Vail, CO, USA, 2018, pp. 101–105

work page 2018
[10]

Lossy compression of general random variables,

E. Riegler, G. Koliander, and H. Bölcskei, “Lossy compression of general random variables,”Information and Inference: A Journal of the IMA, vol. 12, no. 3, pp. 1759–1829, 2023

work page 2023
[11]

The rate-distortion dimension of sets and measures,

T. Kawabata and A. Dembo, “The rate-distortion dimension of sets and measures,”IEEE Trans. Inf. Theory, vol. 40, no. 5, pp. 1564–1572, 1994

work page 1994
[12]

Successive refinement of abstract sources,

V . Kostina and E. Tuncel, “Successive refinement of abstract sources,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6385–6398, 2019

work page 2019
[13]

Asymptotic evaluation of certain Markov process expectations for large time, I,

M. D. Donsker and S. S. Varadhan, “Asymptotic evaluation of certain Markov process expectations for large time, I,”Commun. Pure Appl. Math., vol. 28, no. 1, pp. 1–47, 1975

work page 1975
[14]

Successive refinement of information,

W. H. Equitz and T. M. Cover, “Successive refinement of information,” IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 269–275, 1991

work page 1991
[15]

Kantorovich duality for general transport costs and applications,

N. Gozlan, C. Roberto, P.-M. Samson, and P. Tetali, “Kantorovich duality for general transport costs and applications,”J. Funct. Anal., vol. 273, no. 11, pp. 3327–3405, 2017

work page 2017
[16]

A revisit to rate-distortion theory via optimal weak transport,

J. Zou, L. Fan, J. Gao, and J. Wang, “A revisit to rate-distortion theory via optimal weak transport,”arXiv:2501.09362, 2025

work page arXiv 2025
[17]

Rate distortion theory for general sources with potential application to image compression,

F. Rezaei, N. Ahmed, and C. D. Charalambous, “Rate distortion theory for general sources with potential application to image compression,” International Journal of Applied Mathematical Sciences, vol. 3, no. 2, pp. 141–165, 2006

work page 2006
[18]

Polyanskiy and Y

Y . Polyanskiy and Y . Wu,Information Theory: From Coding to Learn- ing. Cambridge University Press, 2025

work page 2025
[19]

The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,

P.-L. Lions, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 2, pp. 109–145, 1984

work page 1984
[20]

The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,

——, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 4, pp. 223–283, 1984

work page 1984
[21]

The concentration-compactness principle in the calculus of variations. The limit case, part 1,

——, “The concentration-compactness principle in the calculus of variations. The limit case, part 1,”Revista Matemática Iberoamericana, vol. 1, no. 1, pp. 145–201, 1985

work page 1985
[22]

The concentration-compactness principle in the calculus of variations. The limit case, part 2,

——, “The concentration-compactness principle in the calculus of variations. The limit case, part 2,”Revista Matemática Iberoamericana, vol. 1, no. 2, pp. 45–121, 1985

work page 1985
[23]

R. M. Gray,Entropy and Information Theory, 2nd ed. New York: Springer, 2011

work page 2011
[24]

De-noising by soft-thresholding,

D. L. Donoho, “De-noising by soft-thresholding,”IEEE Trans. Inf. Theory, vol. 41, no. 3, pp. 613–627, 1995

work page 1995
[25]

Discontinuous piecewise linear optimiza- tion,

A. R. Conn and M. Mongeau, “Discontinuous piecewise linear optimiza- tion,”Mathematical programming, vol. 80, no. 3, pp. 315–380, 1998

work page 1998
[26]

Discontinuous optimization,

M. Mongeau, “Discontinuous optimization,” inEncyclopedia of Opti- mization. Springer, 2008, pp. 739–744

work page 2008
[27]

Discontinuous optimization in batch production using sumt,

I. Imo and D. Leech, “Discontinuous optimization in batch production using sumt,”The International Journal of Production Research, vol. 22, no. 2, pp. 313–321, 1984

work page 1984
[28]

A simplex algorithm for piecewise-linear programming i: Derivation and proof,

R. Fourer, “A simplex algorithm for piecewise-linear programming i: Derivation and proof,”Mathematical Programming, vol. 33, no. 2, pp. 204–233, 1985

work page 1985
[29]

Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,

W.-L. Chow, “Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,”Mathematische Annalen, vol. 117, no. 1, pp. 98– 105, 1940

work page 1940
[30]

About connecting two points of a completely nonholo- nomic space by admissible curve,

P. K. Rashevsky, “About connecting two points of a completely nonholo- nomic space by admissible curve,”Uch. Zapiski Ped. Inst. Libknechta, vol. 2, pp. 83–94, 1938

work page 1938
[31]

Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications

R. Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications. Providence, RI: American Mathematical Soc., 2002, vol. 91

work page 2002
[32]

Lecture notes on sub-Riemannian geometry,

E. Le Donne, “Lecture notes on sub-Riemannian geometry,” Centro di Ricerca Matematica Ennio de Giorgi, Scuola Normale Superiore, 2023, available online: https://cvgmt.sns.it/paper/5339/

work page 2023
[33]

M. P. Do Carmo and J. Flaherty Francis,Riemannian Geometry. Springer, 1992, vol. 2

work page 1992
[34]

Nonanticipative rate distortion function and relations to filtering theory,

C. D. Charalambous, P. A. Stavrou, and N. U. Ahmed, “Nonanticipative rate distortion function and relations to filtering theory,”IEEE Transac- tions on Automatic Control, vol. 59, no. 4, pp. 937–952, 2014

work page 2014
[35]

Directed information on abstract spaces: Properties and variational equalities,

C. D. Charalambous and P. A. Stavrou, “Directed information on abstract spaces: Properties and variational equalities,”IEEE Trans. Inf. Theory, vol. 62, no. 11, pp. 6019–6052, 2016

work page 2016
[36]

Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,

P. A. Stavrou, T. Charalambous, C. D. Charalambous, and S. Loyka, “Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,”SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3731–3765, 2018

work page 2018
[37]

Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,

C. D. Charalambous, T. Charalambous, C. Kourtellaris, and J. H. van Schuppen, “Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,”IEEE Trans. Inf. Theory, vol. 68, no. 3, pp. 1440–1464, 2022

work page 2022
[38]

Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems

M. Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems. Berlin, Heidelberg: Springer-Verlag, 2000, vol. 306. APPENDIXA PROOF OFTHEOREM6 The proof follows the classical strategy of [19], [20], [38], adapted to the abstract Polish-space setting. We begin by introducing the notion of the concentration ...

work page 2000
[39]

Fix0< ϵ < 1

For eachm∈N, choosex m ∈ X satisfying Qm(R0)≤ Z BR0 (xm) dµm + 1 m ,lim m→∞ Qm(R0) =Q(R 0). Fix0< ϵ < 1

work page
[40]

ChooseR >0such thatQ(R)>1−ϵ > 1

work page
[41]

ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem

For eachm, lety m ∈ Xsatisfy Qm(R)≤ Z BR(ym) dµm + 1 m ,lim m→∞ Qm(R) =Q(R), where the choice ofy m may depend onϵ. ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem. Hence the ballsB R0(xm)andB R(ym)must intersect. Otherwise, Z BR0 (xm) dµm + Z BR(ym) dµm = Z BR0 (xm)∪BR(ym) d...

work page

[1] [1]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,”The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948

work page 1948

[2] [2]

Coding theorems for a discrete source with a fidelity criterion,

——, “Coding theorems for a discrete source with a fidelity criterion,” International Convention Record, vol. 7, pp. 325–350, 1959

work page 1959

[3] [3]

Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression

T. Berger,Rate-Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs: Prentice-Hall, 1971

work page 1971

[4] [4]

Computation of channel capacity and rate-distortion func- tions,

R. Blahut, “Computation of channel capacity and rate-distortion func- tions,”IEEE Trans. Inf. Theory, vol. 18, no. 4, pp. 460–473, 1972

work page 1972

[5] [5]

R. E. Blahut,Principles and Practice of Information Theory. Addison- Wesley Longman Publishing Co., Inc., 1987

work page 1987

[6] [6]

T. M. Cover and J. A. Thomas,Elements of Information Theory, 2nd ed. New York, NY , USA: Wiley, 2006

work page 2006

[7] [7]

On an extremum problem of information theory,

I. Csiszár, “On an extremum problem of information theory,”Studia Scientiarum Mathematicarum Hungarica, vol. 9, no. 1, pp. 57–71, 1974

work page 1974

[8] [8]

An algorithm for computing the capacity of arbitrary discrete memoryless channels,

S. Arimoto, “An algorithm for computing the capacity of arbitrary discrete memoryless channels,”IEEE Trans. Inf. Theory, vol. 18, no. 1, pp. 14–20, 1972

work page 1972

[9] [9]

Rate-distortion theory for general sets and measures,

E. Riegler, H. Bölcskei, and G. Koliander, “Rate-distortion theory for general sets and measures,” inProc. lEEE Int. Symp. Inf. Theory (ISIT), Vail, CO, USA, 2018, pp. 101–105

work page 2018

[10] [10]

Lossy compression of general random variables,

E. Riegler, G. Koliander, and H. Bölcskei, “Lossy compression of general random variables,”Information and Inference: A Journal of the IMA, vol. 12, no. 3, pp. 1759–1829, 2023

work page 2023

[11] [11]

The rate-distortion dimension of sets and measures,

T. Kawabata and A. Dembo, “The rate-distortion dimension of sets and measures,”IEEE Trans. Inf. Theory, vol. 40, no. 5, pp. 1564–1572, 1994

work page 1994

[12] [12]

Successive refinement of abstract sources,

V . Kostina and E. Tuncel, “Successive refinement of abstract sources,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6385–6398, 2019

work page 2019

[13] [13]

Asymptotic evaluation of certain Markov process expectations for large time, I,

M. D. Donsker and S. S. Varadhan, “Asymptotic evaluation of certain Markov process expectations for large time, I,”Commun. Pure Appl. Math., vol. 28, no. 1, pp. 1–47, 1975

work page 1975

[14] [14]

Successive refinement of information,

W. H. Equitz and T. M. Cover, “Successive refinement of information,” IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 269–275, 1991

work page 1991

[15] [15]

Kantorovich duality for general transport costs and applications,

N. Gozlan, C. Roberto, P.-M. Samson, and P. Tetali, “Kantorovich duality for general transport costs and applications,”J. Funct. Anal., vol. 273, no. 11, pp. 3327–3405, 2017

work page 2017

[16] [16]

A revisit to rate-distortion theory via optimal weak transport,

J. Zou, L. Fan, J. Gao, and J. Wang, “A revisit to rate-distortion theory via optimal weak transport,”arXiv:2501.09362, 2025

work page arXiv 2025

[17] [17]

Rate distortion theory for general sources with potential application to image compression,

F. Rezaei, N. Ahmed, and C. D. Charalambous, “Rate distortion theory for general sources with potential application to image compression,” International Journal of Applied Mathematical Sciences, vol. 3, no. 2, pp. 141–165, 2006

work page 2006

[18] [18]

Polyanskiy and Y

Y . Polyanskiy and Y . Wu,Information Theory: From Coding to Learn- ing. Cambridge University Press, 2025

work page 2025

[19] [19]

The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,

P.-L. Lions, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 1,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 2, pp. 109–145, 1984

work page 1984

[20] [20]

The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,

——, “The concentration-compactness principle in the calculus of variations. The locally compact case, part 2,”Annales de l’Institut Henri Poincaré C, Analyse non linéaire, vol. 1, no. 4, pp. 223–283, 1984

work page 1984

[21] [21]

The concentration-compactness principle in the calculus of variations. The limit case, part 1,

——, “The concentration-compactness principle in the calculus of variations. The limit case, part 1,”Revista Matemática Iberoamericana, vol. 1, no. 1, pp. 145–201, 1985

work page 1985

[22] [22]

The concentration-compactness principle in the calculus of variations. The limit case, part 2,

——, “The concentration-compactness principle in the calculus of variations. The limit case, part 2,”Revista Matemática Iberoamericana, vol. 1, no. 2, pp. 45–121, 1985

work page 1985

[23] [23]

R. M. Gray,Entropy and Information Theory, 2nd ed. New York: Springer, 2011

work page 2011

[24] [24]

De-noising by soft-thresholding,

D. L. Donoho, “De-noising by soft-thresholding,”IEEE Trans. Inf. Theory, vol. 41, no. 3, pp. 613–627, 1995

work page 1995

[25] [25]

Discontinuous piecewise linear optimiza- tion,

A. R. Conn and M. Mongeau, “Discontinuous piecewise linear optimiza- tion,”Mathematical programming, vol. 80, no. 3, pp. 315–380, 1998

work page 1998

[26] [26]

Discontinuous optimization,

M. Mongeau, “Discontinuous optimization,” inEncyclopedia of Opti- mization. Springer, 2008, pp. 739–744

work page 2008

[27] [27]

Discontinuous optimization in batch production using sumt,

I. Imo and D. Leech, “Discontinuous optimization in batch production using sumt,”The International Journal of Production Research, vol. 22, no. 2, pp. 313–321, 1984

work page 1984

[28] [28]

A simplex algorithm for piecewise-linear programming i: Derivation and proof,

R. Fourer, “A simplex algorithm for piecewise-linear programming i: Derivation and proof,”Mathematical Programming, vol. 33, no. 2, pp. 204–233, 1985

work page 1985

[29] [29]

Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,

W.-L. Chow, “Über systeme von liearren partiellen differentialgleichun- gen erster ordnung,”Mathematische Annalen, vol. 117, no. 1, pp. 98– 105, 1940

work page 1940

[30] [30]

About connecting two points of a completely nonholo- nomic space by admissible curve,

P. K. Rashevsky, “About connecting two points of a completely nonholo- nomic space by admissible curve,”Uch. Zapiski Ped. Inst. Libknechta, vol. 2, pp. 83–94, 1938

work page 1938

[31] [31]

Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications

R. Montgomery,A Tour of Subriemannian Geometries, Their Geodesics and Applications. Providence, RI: American Mathematical Soc., 2002, vol. 91

work page 2002

[32] [32]

Lecture notes on sub-Riemannian geometry,

E. Le Donne, “Lecture notes on sub-Riemannian geometry,” Centro di Ricerca Matematica Ennio de Giorgi, Scuola Normale Superiore, 2023, available online: https://cvgmt.sns.it/paper/5339/

work page 2023

[33] [33]

M. P. Do Carmo and J. Flaherty Francis,Riemannian Geometry. Springer, 1992, vol. 2

work page 1992

[34] [34]

Nonanticipative rate distortion function and relations to filtering theory,

C. D. Charalambous, P. A. Stavrou, and N. U. Ahmed, “Nonanticipative rate distortion function and relations to filtering theory,”IEEE Transac- tions on Automatic Control, vol. 59, no. 4, pp. 937–952, 2014

work page 2014

[35] [35]

Directed information on abstract spaces: Properties and variational equalities,

C. D. Charalambous and P. A. Stavrou, “Directed information on abstract spaces: Properties and variational equalities,”IEEE Trans. Inf. Theory, vol. 62, no. 11, pp. 6019–6052, 2016

work page 2016

[36] [36]

Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,

P. A. Stavrou, T. Charalambous, C. D. Charalambous, and S. Loyka, “Optimal estimation via nonanticipative rate distortion function and applications to time-varying gauss–markov processes,”SIAM Journal on Control and Optimization, vol. 56, no. 5, pp. 3731–3765, 2018

work page 2018

[37] [37]

Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,

C. D. Charalambous, T. Charalambous, C. Kourtellaris, and J. H. van Schuppen, “Complete characterization of gorbunov and pinsker nonan- ticipatory epsilon entropy of multivariate gaussian sources: Structural properties,”IEEE Trans. Inf. Theory, vol. 68, no. 3, pp. 1440–1464, 2022

work page 2022

[38] [38]

Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems

M. Struwe,Variational Methods: Applications to Nonlinear Partial Differential Equations and Hamiltonian Systems. Berlin, Heidelberg: Springer-Verlag, 2000, vol. 306. APPENDIXA PROOF OFTHEOREM6 The proof follows the classical strategy of [19], [20], [38], adapted to the abstract Polish-space setting. We begin by introducing the notion of the concentration ...

work page 2000

[39] [39]

Fix0< ϵ < 1

For eachm∈N, choosex m ∈ X satisfying Qm(R0)≤ Z BR0 (xm) dµm + 1 m ,lim m→∞ Qm(R0) =Q(R 0). Fix0< ϵ < 1

work page

[40] [40]

ChooseR >0such thatQ(R)>1−ϵ > 1

work page

[41] [41]

ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem

For eachm, lety m ∈ Xsatisfy Qm(R)≤ Z BR(ym) dµm + 1 m ,lim m→∞ Qm(R) =Q(R), where the choice ofy m may depend onϵ. ThenZ BR0 (xm) dµm + Z BR(ym) dµm ≥Q m(R0) +Q m(R)− 2 m →Q(R 0) +Q(R) > 1 2 + 1 2 = 1 = Z X dµm,(15) for all sufficiently largem. Hence the ballsB R0(xm)andB R(ym)must intersect. Otherwise, Z BR0 (xm) dµm + Z BR(ym) dµm = Z BR0 (xm)∪BR(ym) d...

work page