Optimally Bridging Semantics and Data: Generative Semantic Communication via Schrödinger Bridge

Dahua Gao; Guangming Shi; Minxi Yang; Ruichao Liu; Shuai Ma; Youlong Wu

arxiv: 2604.17802 · v1 · submitted 2026-04-20 · 📡 eess.IV · cs.CV

Dahua Gao, Ruichao Liu, Minxi Yang, Shuai Ma, Youlong Wu, Guangming Shi

Pith reviewed 2026-05-10 04:12 UTC · model grok-4.3

keywords: generative semantic communication · Schrödinger Bridge · optimal transport · diffusion models · image transmission · hallucination reduction · self-consistency training

The pith

Schrödinger Bridge constructs direct optimal transport paths from semantics to images for generative semantic communication.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to replace the long, indirect trajectories of standard diffusion models in generative semantic communication with shorter optimal paths given by the Schrödinger Bridge. Existing approaches start from a Gaussian noise distribution and follow guided diffusion to reach image distributions conditioned on semantics, which lengthens computation and allows semantic hallucinations to accumulate. By solving the bridge problem between arbitrary distributions, the method enables direct decoding without the Gaussian starting point. This matters for narrowband, high-noise channels, where the receiver must generate the image from very little transmitted information: shorter trajectories cost less decoder computation and give hallucinations fewer steps in which to accumulate. The authors implement this idea in a diffusion Schrödinger Bridge variant that recovers the required nonlinear dynamics and adds a self-consistency loss to further shorten sampling.

Core claim

The central claim is that the Schrödinger Bridge supplies the optimal stochastic process connecting a semantic distribution to an image distribution, allowing direct generative decoding in GSC. Within this framework the diffusion Schrödinger Bridge variant reconstructs the nonlinear drift term of the underlying diffusion model from Schrödinger potentials, and a self-consistency objective trains a velocity field that points straight to the target image, eliminating Markovian noise prediction and thereby reducing the number of sampling steps required.
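To make the step-count argument concrete, here is a minimal sketch; it is not the paper's DSBGSC sampler. It assumes a hypothetical predictor `predict_target(x, t)` that estimates the final image from the current state, turns that estimate into a velocity `(x1_hat - x) / (1 - t)`, and integrates it with plain Euler steps from the semantic-side state at t = 0 to the image at t = 1. The names `few_step_decode` and `predict_target`, and the toy oracle, are assumptions made for illustration.

```python
import numpy as np

def few_step_decode(x0, predict_target, n_steps):
    """Euler integration of a velocity field that points at the predicted target.

    x0             -- starting state (stands in for the received semantic features)
    predict_target -- hypothetical callable (x, t) -> estimate of the final image
    n_steps        -- number of function evaluations (NFE)
    """
    x = np.asarray(x0, dtype=float).copy()
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    for t, t_next in zip(ts[:-1], ts[1:]):
        # Velocity that would carry x to the predicted target by time 1.
        v = (predict_target(x, t) - x) / (1.0 - t)
        x = x + (t_next - t) * v
    return x

# Toy check with a perfect (oracle) predictor: any step count lands on the target.
target = np.array([1.0, 2.0, 3.0, 4.0])
oracle = lambda x, t: target
one_step = few_step_decode(np.zeros(4), oracle, n_steps=1)
ten_step = few_step_decode(np.zeros(4), oracle, n_steps=10)
```

With a perfect predictor a single step already reaches the target, which is the idealized limit of the claim. In practice the predictor is imperfect, and the extra steps are what correct its errors.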

What carries the argument

The Schrödinger Bridge: the entropy-regularized optimal transport problem whose solution is the most likely stochastic process, relative to a reference diffusion, connecting two given marginal distributions.
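For reference, the Schrödinger Bridge problem is commonly written as follows. This is the standard textbook form, and the symbols are chosen here for illustration ($\mu$ for the semantic distribution, $\nu$ for the image distribution, $\mathbb{W}$ for the reference diffusion, $c$ for the transport cost, $\varepsilon$ for the regularization strength); the paper's own notation may differ.

```latex
% Dynamic form: among path measures with the prescribed endpoint marginals,
% pick the one closest in KL divergence to the reference process.
\mathbb{P}^{\star}
  = \arg\min_{\mathbb{P}\,:\,\mathbb{P}_0=\mu,\ \mathbb{P}_1=\nu}
    \mathrm{KL}\!\left(\mathbb{P}\,\|\,\mathbb{W}\right)

% Static form: entropy-regularized optimal transport over couplings of mu and nu.
\pi^{\star}
  = \arg\min_{\pi\in\Pi(\mu,\nu)}
    \int c(x,y)\,\mathrm{d}\pi(x,y)
    + \varepsilon\,\mathrm{KL}\!\left(\pi\,\|\,\mu\otimes\nu\right)
```

As $\varepsilon \to 0$ the static problem approaches classical (unregularized) optimal transport.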

If this is right

Generative decoding can start directly from semantics rather than from Gaussian noise, removing an unnecessary intermediate distribution.
Hallucination is reduced because the transport path is the shortest in the sense of the Schrödinger problem rather than a long diffusion chain.
Inference requires far fewer steps once a nonlinear velocity field is learned via the self-consistency objective.
The same bridge construction applies to any pair of distributions, not only those reachable from a Gaussian prior.
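The simplest instance of starting from something other than Gaussian noise is the Brownian bridge, which is the Schrödinger Bridge between two fixed endpoints under a Brownian reference. The sketch below samples its intermediate states. It is an illustration only: `brownian_bridge_sample`, `sigma`, and the toy vectors standing in for a semantic-side state and an image are assumptions, not the paper's construction.

```python
import numpy as np

def brownian_bridge_sample(x0, x1, t, sigma=1.0, rng=None):
    """Sample X_t of a Brownian bridge pinned at x0 (t=0) and x1 (t=1).

    The marginal is Gaussian with mean (1-t)*x0 + t*x1 and
    standard deviation sigma*sqrt(t*(1-t)), so it collapses onto
    the endpoints at t=0 and t=1.
    """
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    mean = (1.0 - t) * x0 + t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x0.shape)

# Toy endpoints (assumed): a "semantic" state and an "image" state.
semantic_state = np.array([0.0, 0.0, 0.0])
image_state = np.array([1.0, -1.0, 2.0])
rng = np.random.default_rng(0)
midpoint = brownian_bridge_sample(semantic_state, image_state, 0.5, sigma=0.1, rng=rng)
```

Neither endpoint is Gaussian noise here: the process is noisy only in between, and the noise vanishes at both ends.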

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same optimal-transport framing could be applied to semantic transmission of video or point-cloud data where long diffusion chains are equally costly.
On edge devices the reduced step count might make real-time semantic decoding feasible without cloud offload.
One could test whether replacing the self-consistency loss with an explicit Wasserstein penalty produces still shorter paths or different fidelity trade-offs.

Load-bearing premise

The nonlinear drift of the diffusion process can be recovered exactly from the Schrödinger potentials so that the resulting trajectories are truly optimal and free of approximation errors that would reintroduce hallucinations.

What would settle it

Measure the actual transport cost or path length between the semantic and image distributions on held-out data; if the SB trajectories are not shorter than standard diffusion paths while hallucination rates stay the same or rise, the optimality claim does not hold.
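One way such a measurement could be set up is sketched below. The choice of metrics (discrete path length, and a straightness ratio against the endpoint displacement) is an assumption for illustration, and the function names and toy trajectories are hypothetical; the paper's own evaluation protocol is not described here.

```python
import numpy as np

def path_length(traj):
    """Sum of Euclidean step lengths along a discretized trajectory.

    traj -- array of shape (n_points, dim), one row per sampling step.
    """
    steps = np.diff(traj, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())

def straightness_ratio(traj):
    """Path length divided by endpoint displacement (1.0 means perfectly straight)."""
    displacement = float(np.linalg.norm(traj[-1] - traj[0]))
    return path_length(traj) / displacement

# Toy trajectories between the same two endpoints.
start, end = np.array([0.0, 0.0]), np.array([1.0, 0.0])
straight = np.linspace(start, end, 11)           # direct path
s = np.linspace(0.0, 1.0, 11)
detour = straight.copy()
detour[:, 1] = np.sin(np.pi * s)                 # same endpoints, curved path

straight_ratio = straightness_ratio(straight)
detour_ratio = straightness_ratio(detour)
```

Averaging such a ratio over held-out samples for both the bridge sampler and a standard diffusion sampler would give one number per method to compare alongside the hallucination rate.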

Figures

Figures reproduced from arXiv: 2604.17802 by Dahua Gao, Guangming Shi, Minxi Yang, Ruichao Liu, Shuai Ma, Youlong Wu.

Figure 2: Schematic diagram of the proposed SBGSC framework. The framework mainly consists of a joint source-channel semantic encoder, a …

Figure 3: The overall architecture of the proposed DSBGSC. The optimal …

Figure 4: Visual comparison of semantic perception quality among different methods under AWGN channel at SNR = 7 dB.

Figure 5: Comparison of semantic perception quality among different …

Figure 6: Visual comparison of semantic perception quality among different methods under AWGN channel with CBR = 1/48.

Figure 8: Visual comparison of hallucination suppression. Red boxes …

Figure 9: Generative processes for semantic and data distribution transfer with NFE = 10. Each figure depicts the direct prediction performance of …

Figure 10: Comparison of semantic perceptual quality under different …

Original abstract

Generative Semantic Communication (GSC) is a promising solution for image transmission over narrow-band and high-noise channels. However, existing GSC methods rely on long, indirect transport trajectories from a Gaussian to an image distribution guided by semantics, causing severe hallucination and high computational cost. To address this, we propose a general framework named Schrödinger Bridge-based GSC (SBGSC). By leveraging the Schrödinger Bridge (SB) to construct optimal transport trajectories between arbitrary distributions, SBGSC breaks Gaussian limitations and enables direct generative decoding from semantics to images. Within this framework, we design Diffusion SB-based GSC (DSBGSC). DSBGSC reconstructs the nonlinear drift term of diffusion models using Schrödinger potentials, achieving direct optimal distribution transport to reduce hallucinations and computational overhead. To further accelerate generation, we propose a self-consistency-based objective guiding the model to learn a nonlinear velocity field pointing directly toward the image, bypassing Markovian noise prediction to significantly reduce sampling steps. Simulation results demonstrate that DSBGSC outperforms state-of-the-art GSC methods, improving FID by at least 38% and SSIM by 49.3%, while accelerating inference speed by over 8 times.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Schrödinger Bridge is used for direct semantic-to-image transport in GSC, with a self-consistency velocity trick for speed, but the optimality of the reconstructed drift is not clearly verified.

The paper's main contribution is applying the Schrödinger Bridge to generative semantic communication so that decoding can follow a direct optimal transport path from semantic features to images instead of the usual long Gaussian-to-image diffusion trajectory. The authors instantiate this as DSBGSC, which reconstructs the nonlinear drift of a diffusion model from Schrödinger potentials and adds a self-consistency objective that trains a velocity field pointing straight at the target image, cutting the number of sampling steps.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a Schrödinger Bridge-based Generative Semantic Communication (SBGSC) framework and its diffusion instantiation (DSBGSC). It reconstructs the nonlinear drift of diffusion models from Schrödinger potentials to realize direct optimal transport trajectories between semantic and image distributions (bypassing Gaussian intermediaries), and introduces a self-consistency velocity objective to learn a nonlinear field that reduces sampling steps. Simulation results are claimed to demonstrate at least 38% FID improvement, 49.3% SSIM improvement, and >8× faster inference over prior GSC methods.

Significance. If the optimality of the reconstructed transport is established, the work would provide a theoretically grounded route to lower hallucination and latency in semantic image transmission over constrained channels. The integration of Schrödinger Bridge theory with diffusion drift reconstruction and self-consistency training is a non-trivial synthesis that could inform subsequent research at the intersection of optimal transport and generative semantic communications.

major comments (2)

[Abstract] The central claim that Schrödinger-potential reconstruction of the diffusion drift 'achieves direct optimal distribution transport' is load-bearing for the hallucination-reduction argument, yet no verification is supplied (e.g., realized transport cost, marginal-matching error, or comparison to the exact SB solution) that the learned drift satisfies the SB optimality conditions rather than constituting an approximation.
[Abstract] The self-consistency velocity objective is presented as an independent accelerator, but its interaction with the SB-derived drift is not analyzed; it is therefore unclear whether the reported speed and fidelity gains derive from optimality or from the velocity-field training alone.

minor comments (1)

The experimental setup, datasets, channel models, and baseline implementations are not described in the abstract, impeding assessment of the reported FID/SSIM/speed numbers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major comment point by point below. The revisions strengthen the theoretical and empirical grounding of the optimality claims without altering the core contributions.

Referee: [Abstract] The central claim that Schrödinger-potential reconstruction of the diffusion drift 'achieves direct optimal distribution transport' is load-bearing for the hallucination-reduction argument, yet no verification is supplied (e.g., realized transport cost, marginal-matching error, or comparison to the exact SB solution) that the learned drift satisfies the SB optimality conditions rather than constituting an approximation.

Authors: We agree that explicit verification of the SB optimality conditions strengthens the central claim. The manuscript derives the nonlinear drift reconstruction from the Schrödinger potentials (Eqs. 8–12) to satisfy the SB optimality conditions by construction, but we acknowledge the absence of direct empirical checks. In the revised version we add Section 4.3 with (i) realized transport cost under the learned drift, (ii) marginal-matching error between source and target distributions, and (iii) numerical comparison against the exact SB solution obtained via the Sinkhorn algorithm on discretized marginals. These results confirm that the reconstructed drift closely tracks the optimal trajectory, supporting the reported hallucination reduction. revision: yes
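For readers unfamiliar with the Sinkhorn algorithm mentioned in this response, the following is a minimal sketch of the standard iteration on discretized marginals. It is not the authors' verification code: the function name, the regularization value `eps`, the squared-distance cost, and the toy histograms are assumptions for illustration.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, eps=0.5, n_iters=500):
    """Entropy-regularized optimal transport plan between histograms a and b.

    a, b  -- nonnegative weights summing to 1 (source and target marginals)
    cost  -- matrix of pairwise transport costs, shape (len(a), len(b))
    eps   -- entropic regularization strength (assumed value)
    """
    kernel = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (kernel @ v)              # rescale rows to match a
        v = b / (kernel.T @ u)            # rescale columns to match b
    return u[:, None] * kernel * v[None, :]

# Toy discretized marginals on a shared 1-D grid (assumed data).
grid = np.linspace(0.0, 1.0, 5)
cost = (grid[:, None] - grid[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
plan = sinkhorn_plan(a, b, cost)

# Marginal-matching error of the computed plan.
row_error = np.abs(plan.sum(axis=1) - a).max()
col_error = np.abs(plan.sum(axis=0) - b).max()
```

The row and column errors are the discrete counterpart of the marginal-matching check, and a plan of this kind could serve as the reference against which a learned drift is compared.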
Referee: [Abstract] The self-consistency velocity objective is presented as an independent accelerator, but its interaction with the SB-derived drift is not analyzed; it is therefore unclear whether the reported speed and fidelity gains derive from optimality or from the velocity-field training alone.

Authors: We thank the referee for highlighting the missing interaction analysis. The self-consistency objective is not independent; it is formulated on the velocity field obtained from the SB-derived drift (Eq. 15) so that the learned field remains consistent with the optimal transport path while bypassing Markovian noise prediction. In the revision we add Section 3.4 containing both a theoretical argument showing that the combined objective preserves the SB marginal-matching property and ablation experiments that isolate the contribution of each component. The results demonstrate that the largest gains in speed and fidelity occur only when the self-consistency training is applied to the SB drift, indicating synergy rather than isolated effects. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain; claims rest on independent SB formulation and new objective

full rationale

The paper's central steps—using Schrödinger Bridge to define optimal trajectories between semantic and image distributions, reconstructing drift via potentials, and adding a self-consistency velocity objective—are presented as direct applications of established SB theory plus a novel training signal. No step reduces by construction to a fitted parameter renamed as prediction, a self-citation chain, or a redefinition of the target metric. The self-consistency objective is introduced as an independent acceleration mechanism rather than tautologically equivalent to the optimality claim. Performance improvements are reported as empirical outcomes, not forced by the formulation itself. The derivation remains self-contained against external SB mathematics and diffusion baselines.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework assumes existence and computability of Schrödinger potentials between semantic and image distributions, plus that diffusion models can be reparameterized to follow the resulting bridge without loss of optimality.

axioms (2)

domain assumption: a Schrödinger Bridge exists and can be constructed between arbitrary distributions, including semantic-conditioned image distributions
Invoked to justify direct optimal transport trajectories replacing Gaussian paths
domain assumption: the diffusion model drift can be exactly reconstructed from Schrödinger potentials
Central to the DSBGSC claim of optimal transport

pith-pipeline@v0.9.0 · 5536 in / 1346 out tokens · 34430 ms · 2026-05-10T04:12:32.901115+00:00 · methodology

[1] [1]

Deep joint source-channel coding for wireless image transmission,

E. Bourtsoulatze, D. Burth Kurka, and D. G ¨und¨uz, “Deep joint source-channel coding for wireless image transmission,”IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 3, pp. 567–579, 2019

work page 2019

[2] [2]

Recent contributions to the mathematical theory of communication,

W. Weaver, “Recent contributions to the mathematical theory of communication,”ETC: a review of general semantics, pp. 261– 281, 1953

work page 1953

[3] [3]

From semantic communication to semantic-aware networking: Model, architecture, and open prob- lems,

G. Shi, Y . Xiao, Y . Li, and X. Xie, “From semantic communication to semantic-aware networking: Model, architecture, and open prob- lems,”IEEE Communications Magazine, vol. 59, no. 8, pp. 44–50, 2021

work page 2021

[4] [4]

Neural joint source-channel coding,

K. Choi, K. Tatwawadi, A. Grover, T. Weissman, and S. Ermon, “Neural joint source-channel coding,” inProceedings of the 36th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, K. Chaudhuri and R. Salakhutdinov, Eds., vol. 97. PMLR, 09–15 Jun 2019, pp. 1182–1192

work page 2019

[5] [5]

Swinjscc: Taming swin transformer for deep joint source-channel coding,

K. Yang, S. Wang, J. Dai, X. Qin, K. Niu, and P. Zhang, “Swinjscc: Taming swin transformer for deep joint source-channel coding,” IEEE Transactions on Cognitive Communications and Networking, vol. 11, no. 1, pp. 90–104, 2025

work page 2025

[6] [6]

The perception-distortion tradeoff,

Y . Blau and T. Michaeli, “The perception-distortion tradeoff,” in2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 6228–6237

work page 2018

[7] [7]

On the rate- distortion theory for task-specific semantic communication,

J. Chai, H. Zhu, Y . Xiao, G. Shi, and P. Zhang, “On the rate- distortion theory for task-specific semantic communication,”En- tropy, vol. 27, no. 8, p. 775, 2025

work page 2025

[8] [8]

En- hancing semantic communication with deep generative models: An overview,

E. Grassucci, Y . Mitsufuji, P. Zhang, and D. Comminiello, “En- hancing semantic communication with deep generative models: An overview,” inICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 13 021–13 025

work page 2024

[9] [9]

Generative semantic communication: Architectures, technologies, and applications,

J. Ren, Y . Sun, H. Du, W. Yuan, C. Wang, X. Wang, Y . Zhou, Z. Zhu, F. Wang, and S. Cui, “Generative semantic communication: Architectures, technologies, and applications,”Engineering, 2025

work page 2025

[10] [10]

Score-based generative modeling through stochastic differential equations,

Y . Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, “Score-based generative modeling through stochastic differential equations,” inInternational Conference on Learning Representations, 2021

work page 2021

[11] [11]

Conditional image synthesis with diffusion models: A survey,

Z. Zhan, D. Chen, J.-P. Mei, Z. Zhao, J. Chen, C. Chen, S. Lyu, and C. Wang, “Conditional image synthesis with diffusion models: A survey,”Transactions on Machine Learning Research, 2025, survey Certification. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 23

work page 2025

[12] [12]

Sequential semantic gen- erative communication for progressive text-to-image generation,

H. Nam, J. Park, J. Choi, and S.-L. Kim, “Sequential semantic gen- erative communication for progressive text-to-image generation,” in2023 20th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), 2023, pp. 91–94

[13] M. Zhang, H. Wu, G. Zhu, R. Jin, X. Chen, and D. Gündüz, “Semantics-guided diffusion for deep joint source-channel coding in wireless image transmission,” IEEE Transactions on Wireless Communications, vol. 25, pp. 1547–1564, 2026

[14] D. Gao, Y. Yi, M. Yang, J. Li, D. Liu, and W. Xu, “Bridging semantic scale gaps in image transmission through multi-scale joint perception and generation,” IEEE Wireless Communications Letters, vol. 14, no. 10, pp. 3314–3318, 2025

[15] S. K. Aithal, P. Maini, Z. C. Lipton, and J. Z. Kolter, “Understanding hallucinations in diffusion models through mode interpolation (2024),” vol. 2406

[16] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” Advances in neural information processing systems, vol. 33, pp. 6840–6851, 2020

[17] C. Léonard, “A Survey of the Schrödinger Problem and Some of Its Connections with Optimal Transport,” Discrete and Continuous Dynamical Systems, vol. 34, no. 4, pp. 1533–1574, 2014

[18] K. Zhang, L. Li, W. Lin, Y. Yan, R. Li, W. Cheng, and Z. Han, “Semantic successive refinement: A generative ai-aided semantic communication framework,” IEEE Transactions on Cognitive Communications and Networking, vol. 11, no. 2, pp. 687–699, 2025

[19] M. U. Lokumarambage, V. S. S. Gowrisetty, H. Rezaei, T. Sivalingam, N. Rajatheva, and A. Fernando, “Wireless end-to-end image transmission system using semantic communications,” IEEE Access, vol. 11, pp. 37 149–37 163, 2023

[20] Z. Ding, S. Jiang, and J. Zhao, “Take a close look at mode collapse and vanishing gradient in gan,” in 2022 IEEE 2nd International Conference on Electronic Technology, Communication and Information (ICETCI), 2022, pp. 597–602

[21] W. Yang, Z. Xiong, Y. Yuan, W. Jiang, T. Q. S. Quek, and M. Debbah, “Agent-driven generative semantic communication with cross-modality and prediction,” IEEE Transactions on Wireless Communications, vol. 24, no. 3, pp. 2233–2248, 2025

[22] M. Yang, D. Gao, F. Xie, J. Li, X. Song, and G. Shi, “SG2SC: A Generative Semantic Communication Framework for Scene Understanding-Oriented Image Transmission,” in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024, pp. 13 486–13 490

[23] E. Grassucci, G. Pignata, G. Cicchetti, and D. Comminiello, “Lightweight diffusion models for resource-constrained semantic communication,” IEEE Wireless Communications Letters, vol. 14, no. 9, pp. 2743–2747, 2025

[24] J. Park and S. W. Yoon, “Transmit what you need: Task-adaptive semantic communications for visual information,” IEEE Journal on Selected Areas in Communications, vol. 43, no. 12, pp. 4182–4197, 2025

[25] E. Hopf, “The partial differential equation $u_t + u u_x = \mu u_{xx}$,” Communications on Pure and Applied Mathematics, vol. 3, no. 3, pp. 201–230, 1950

[26] J. D. Cole, “On a quasi-linear parabolic equation occurring in aerodynamics,” Quarterly of applied mathematics, vol. 9, no. 3, pp. 225–236, 1951

[27] Y. Chen, T. T. Georgiou, and M. Pavon, “On the relation between optimal transport and schrödinger bridges: A stochastic control viewpoint,” Journal of Optimization Theory and Applications, vol. 169, no. 2, pp. 671–691, 2016

[28] V. De Bortoli, J. Thornton, J. Heng, and A. Doucet, “Diffusion schrödinger bridge with applications to score-based generative modeling,” Advances in neural information processing systems, vol. 34, pp. 17 695–17 709, 2021

[29] T. Chen, G.-H. Liu, and E. Theodorou, “Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory,” in International Conference on Learning Representations, 2022

[30] C. Yue, Z. Peng, J. Ma, S. Du, P. Wei, and D. Zhang, “Image Restoration Through Generalized Ornstein-Uhlenbeck Bridge,” in Proceedings of the 41st International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 235. PMLR, 2024, pp. 58 068–58 089

[31] L. Zhou, A. Lou, S. Khanna, and S. Ermon, “Denoising diffusion bridge models,” in The Twelfth International Conference on Learning Representations, 2024

[32] K. Zhu, M. Pan, Y. Ma, Y. Fu, J. Yu, J. Wang, and Y. Shi, “UniDB: A unified diffusion bridge framework via stochastic optimal control,” in Forty-second International Conference on Machine Learning, 2025

[33] N. J. Beaudry and R. Renner, “An intuitive proof of the data processing inequality,” Quantum Information and Computation, vol. 12, no. 5&6, pp. 432–441, 2012

[34] C. R. Givens and R. M. Shortt, “A class of wasserstein metrics for probability distributions,” Michigan Mathematical Journal, vol. 31, no. 2, pp. 231–240, 1984

[35] G.-H. Liu, A. Vahdat, D.-A. Huang, E. A. Theodorou, W. Nie, and A. Anandkumar, “I2sb: image-to-image schrödinger bridge,” in Proceedings of the 40th International Conference on Machine Learning, 2023, pp. 22 042–22 062

[36] Y. Song, P. Dhariwal, M. Chen, and I. Sutskever, “Consistency models,” in Proceedings of the 40th International Conference on Machine Learning, 2023, pp. 32 211–32 252

[37] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255

[38] Z. Wang, E. Simoncelli, and A. Bovik, “Multiscale structural similarity for image quality assessment,” in The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2, 2003, pp. 1398–1402 Vol.2

[39] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 586–595

[40] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “Gans trained by a two time-scale update rule converge to a local nash equilibrium,” Advances in neural information processing systems, vol. 30, 2017

[41] S. F. Yilmaz, X. Niu, B. Bai, W. Han, L. Deng, and D. Gündüz, “High perceptual quality wireless image delivery with denoising diffusion models,” in IEEE INFOCOM 2024 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), 2024, pp. 1–5

[42] A. J. Stam, “Some inequalities satisfied by the quantities of information of fisher and shannon,” Information and Control, vol. 2, no. 2, pp. 101–112, 1959

[43] C. R. Rao et al., “Information and the accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc., vol. 37, no. 3, pp. 81–91, 1945

[44] P. E. Kloeden and R. Pearson, “The numerical solution of stochastic differential equations,” The ANZIAM Journal, vol. 20, no. 1, pp. 8–12, 1977

[45] T. H. Gronwall, “Note on the derivatives with respect to a parameter of the solutions of a system of differential equations,” Annals of Mathematics, vol. 20, no. 4, pp. 292–296, 1919
