Generative Semantic Communication via Alternating Dual-Domain Posterior Sampling
Pith reviewed 2026-05-10 07:30 UTC · model grok-4.3
The pith
Treating semantic decoding as a Bayesian inverse problem, the paper argues that posterior sampling preserves the data distribution and therefore attains optimal perceptual quality in wireless image transmission.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We formulate semantic decoding as a Bayesian inverse problem and prove that posterior sampling achieves optimal perceptual quality by preserving the data distribution. Building on this insight, we propose alternating dual-domain posterior sampling (ADDPS), a diffusion-based SemCom receiver that alternately enforces latent-domain and image-domain consistency during the sampling process. This alternating strategy decomposes joint posterior sampling into simpler subproblems, avoiding gradient conflicts while retaining the complementary strengths of both domains.
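The distribution-preservation claim can be illustrated with a scalar Gaussian toy problem. This is a minimal sketch under assumptions that are not from the paper: a unit-variance Gaussian source, an additive Gaussian channel, and a hypothetical helper named `gaussian_toy`. In this setting the MAP estimate coincides with the posterior mean, so its outputs have less variance than the source, while posterior samples recover the source variance.

```python
import numpy as np

def gaussian_toy(noise_std=1.0, n=200_000, seed=0):
    """Compare MAP decoding with posterior sampling for x ~ N(0, 1), y = x + noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)                     # source samples, variance 1
    y = x + noise_std * rng.standard_normal(n)     # additive Gaussian channel
    # Closed-form Gaussian posterior p(x | y) = N(post_mean, post_var).
    post_var = noise_std**2 / (1.0 + noise_std**2)
    post_mean = y / (1.0 + noise_std**2)
    x_map = post_mean                              # MAP equals the posterior mean here
    x_ps = post_mean + np.sqrt(post_var) * rng.standard_normal(n)  # posterior samples
    return {"source_var": x.var(), "map_var": x_map.var(), "ps_var": x_ps.var()}

stats = gaussian_toy()
print(stats)
```

With `noise_std=1.0` the MAP outputs should have variance near 0.5 and the posterior samples variance near 1, which is the shrinkage-versus-preservation contrast the core claim relies on. The toy says nothing about whether a diffusion sampler reaches the true posterior in practice.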
What carries the argument
Alternating dual-domain posterior sampling (ADDPS), which switches between latent-domain and image-domain consistency checks inside a diffusion sampler to solve the Bayesian inverse problem for semantic decoding.
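The alternating idea can be sketched in a few lines, with heavy caveats. The loop below is not the paper's algorithm: it replaces the diffusion prior with a standard normal, uses an unadjusted Langevin update instead of a reverse diffusion step, and assumes a hypothetical linear encoder `E` and decoder `D`. It only shows the control flow of switching between a latent-domain and an image-domain consistency term, and makes no claim about which distribution the chain converges to.

```python
import numpy as np

def alternating_langevin(y, E, D, sigma=0.5, tau=0.5, steps=2000, lr=1e-2, seed=0):
    """Toy Langevin loop that alternates latent-domain and image-domain guidance."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(E.shape[1])
    x_dec = D @ y  # image-domain reference obtained by decoding the received latent
    for k in range(steps):
        score = -x  # score of a standard normal prior (stand-in for a learned prior)
        if k % 2 == 0:
            # Latent-domain consistency: gradient of log N(y; E x, sigma^2 I).
            score = score + E.T @ (y - E @ x) / sigma**2
        else:
            # Image-domain consistency: gradient of log N(x_dec; x, tau^2 I).
            score = score + (x_dec - x) / tau**2
        x = x + lr * score + np.sqrt(2.0 * lr) * rng.standard_normal(x.shape)
    return x

E = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # hypothetical linear "encoder"
D = np.linalg.pinv(E)                                # hypothetical linear "decoder"
rng = np.random.default_rng(1)
x_true = rng.standard_normal(2)
y = E @ x_true + 0.5 * rng.standard_normal(3)        # noisy received latent
x_hat = alternating_langevin(y, E, D)
```

Only one guidance term is active per step, so the two gradients never have to be balanced against each other inside a single update. Whether that schedule leaves the intended joint posterior invariant is exactly the question the referee report raises.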
If this is right
- Posterior sampling recovers the full statistical properties of the source images rather than collapsing to a single most-likely reconstruction.
- ADDPS avoids the overconfident pseudo-posterior that arises when latent-domain and image-domain guidance are applied simultaneously.
- The receiver attains higher perceptual quality than MAP-based or single-domain diffusion receivers on standard face-image benchmarks.
- Pretrained generative priors can be used in semantic communication without the distribution mismatch that limits MAP estimators.
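One plausible reading of the "overconfident pseudo-posterior" is that simultaneous dual-domain guidance counts the same received signal twice. The sketch below assumes that reading, which this page does not confirm, and uses a scalar Gaussian model with a hypothetical helper `posterior_var`. Applying one Gaussian likelihood twice yields a narrower posterior than the evidence supports.

```python
def posterior_var(noise_var, n_uses=1):
    """Posterior variance of x ~ N(0, 1) when one Gaussian likelihood is applied n_uses times."""
    # Precisions add: prior precision 1 plus n_uses copies of 1 / noise_var.
    return 1.0 / (1.0 + n_uses / noise_var)

true_var = posterior_var(1.0, n_uses=1)    # likelihood used once
pseudo_var = posterior_var(1.0, n_uses=2)  # same measurement counted twice
print(true_var, pseudo_var)
```

In this toy the double-counted variance is strictly smaller than the correct one for any finite noise level, which matches the qualitative description of overconfidence but not necessarily the paper's actual mechanism.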
Where Pith is reading between the lines
- If the alternating decomposition works across channel conditions, future semantic systems could transmit under higher noise levels while still recovering natural image statistics.
- The same alternating dual-domain idea might extend to other generative inverse problems where both compressed and pixel-space guidance are available, such as video or depth reconstruction.
- Testing the method on non-face image categories would show whether the perceptual gains depend on the specific statistics of the FFHQ training distribution.
Load-bearing premise
Alternating enforcement of latent-domain and image-domain consistency decomposes the joint posterior into simpler subproblems without introducing gradient conflicts or distribution mismatches.
What would settle it
If a non-alternating joint guidance method or a pure MAP estimator on the same diffusion backbone yields equal or higher perceptual scores than ADDPS on the FFHQ test set, the claimed advantage of posterior sampling and the alternating decomposition would be falsified.
Original abstract
Generative semantic communication (SemCom) harnesses pretrained generative priors to improve the perceptual quality of wireless image transmission. Existing generative SemCom receivers, however, rely on maximum a posteriori (MAP) estimation, which fundamentally cannot preserve the data distribution and thus limits achievable perceptual quality. Moreover, current diffusion-based approaches using single-domain guidance face significant limitations: latent-domain guidance is sensitive to channel noise, while image-domain guidance inherits decoder bias. Simply combining both domains simultaneously yields an overconfident pseudo-posterior. In this paper, we formulate semantic decoding as a Bayesian inverse problem and prove that posterior sampling achieves optimal perceptual quality by preserving the data distribution. Building on this insight, we propose alternating dual-domain posterior sampling (ADDPS), a diffusion-based SemCom receiver that alternately enforces latent-domain and image-domain consistency during the sampling process. This alternating strategy decomposes joint posterior sampling into simpler subproblems, avoiding gradient conflicts while retaining the complementary strengths of both domains. Experiments on FFHQ demonstrate that the proposed ADDPS achieves superior perceptual quality compared with existing methods.
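The abstract's formulation can be written out compactly. The notation here is assumed for illustration and is not taken from the paper: x is the source image, f a generic encoder, y the received signal, n additive Gaussian channel noise, and the two estimators are the MAP reconstruction and a posterior sample. The last line is the standard identity that averaging the posterior over received signals returns the source distribution.

```latex
% Assumed notation, not the paper's: x source, f encoder, y received signal, n noise.
\begin{align}
  y &= f(x) + n, \qquad n \sim \mathcal{N}(0, \sigma^2 I), \\
  p(x \mid y) &\propto p(x)\, p(y \mid x), \\
  \hat{x}_{\mathrm{MAP}} &= \arg\max_{x}\, p(x \mid y), \qquad
  \hat{x}_{\mathrm{PS}} \sim p(x \mid y), \\
  p_{\hat{x}_{\mathrm{PS}}}(x) &= \int p(x \mid y)\, p(y)\, dy = p(x).
\end{align}
```

The MAP estimator has no analogous identity, which is the sense in which it cannot preserve the data distribution in general.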
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper formulates semantic decoding as a Bayesian inverse problem and asserts a proof that posterior sampling achieves optimal perceptual quality by preserving the data distribution. It proposes Alternating Dual-Domain Posterior Sampling (ADDPS), a diffusion-based receiver that alternates latent-domain and image-domain consistency enforcement during sampling to decompose the joint posterior into simpler subproblems while avoiding gradient conflicts. Experiments on FFHQ are claimed to show superior perceptual quality over existing generative SemCom methods.
Significance. If the optimality proof holds and the alternating procedure is shown to preserve the correct joint posterior, the work would offer a theoretically grounded advance over MAP-based receivers in generative semantic communication. It directly addresses limitations of single-domain guidance in diffusion models and could improve perceptual quality in wireless image transmission systems that leverage pretrained generative priors.
major comments (3)
- [Abstract / §3] Abstract and §3 (Optimality Proof): The claim that posterior sampling is proven optimal for perceptual quality by preserving the data distribution is asserted without any equations, assumptions on the generative prior or channel, or error analysis visible in the abstract. The full derivation must be supplied with explicit statements of the perceptual quality metric and conditions under which the stationary distribution is preserved.
- [§4] §4 (ADDPS Algorithm): The alternating dual-domain consistency steps are stated to decompose joint posterior sampling without introducing gradient conflicts or distribution mismatches. No derivation or invariance argument is referenced showing that sequential latent-domain and image-domain score updates leave the stationary distribution equal to the true joint p(x | y, z); sequential application of domain-specific conditionals can converge to a different invariant measure, directly threatening transfer of the general optimality result.
- [Experiments] Experiments section: The abstract states that ADDPS achieves superior perceptual quality on FFHQ, yet supplies no quantitative metrics (FID, LPIPS, etc.), baselines, number of trials, or statistical details. These must be reported with error bars and ablation studies on the alternating schedule to substantiate the superiority claim.
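The invariance concern in the §4 comment can be stated as a condition on Markov kernels. The symbols below are assumptions made for this sketch: π denotes the target joint posterior p(x | y, z), and K_lat and K_img denote the transition kernels of the latent-domain and image-domain half-steps.

```latex
% Assumed symbols: \pi = p(x \mid y, z); K_lat, K_img are the two half-step kernels.
\begin{align}
  \int \pi(x)\, K_{\mathrm{lat}}(x, x')\, dx = \pi(x')
  \ \text{ and } \
  \int \pi(x)\, K_{\mathrm{img}}(x, x')\, dx = \pi(x')
  \;\Longrightarrow\;
  \int \pi(x)\, (K_{\mathrm{lat}} K_{\mathrm{img}})(x, x')\, dx = \pi(x').
\end{align}
```

If each half-step leaves π invariant, so does their composition. If instead each half-step targets its own single-domain conditional, nothing forces the composed chain to have π as its invariant measure, so the paper would need to establish which case applies.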
minor comments (1)
- Define all acronyms (SemCom, ADDPS, MAP) on first use and ensure consistent notation for the latent variable z and observation y across sections.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving clarity in the theoretical sections and experimental reporting. We address each major comment below and will revise the manuscript accordingly.
- Referee: [Abstract / §3] Abstract and §3 (Optimality Proof): The claim that posterior sampling is proven optimal for perceptual quality by preserving the data distribution is asserted without any equations, assumptions on the generative prior or channel, or error analysis visible in the abstract. The full derivation must be supplied with explicit statements of the perceptual quality metric and conditions under which the stationary distribution is preserved.
Authors: We agree that the abstract is necessarily concise and omits the detailed derivation. The full proof appears in §3, but we will revise the abstract to include a high-level statement of the result along with the key assumptions. In the revised §3, we will explicitly state the perceptual quality metric (the expected value of a perceptual distance function integrated against the posterior), the assumptions (the generative prior is a pretrained diffusion model whose stationary distribution matches the data distribution p(x), and the channel is an additive white Gaussian noise model), and the conditions under which posterior sampling preserves the stationary distribution. A brief error analysis for the finite-step diffusion approximation will also be added. revision: yes
- Referee: [§4] §4 (ADDPS Algorithm): The alternating dual-domain consistency steps are stated to decompose joint posterior sampling without introducing gradient conflicts or distribution mismatches. No derivation or invariance argument is referenced showing that sequential latent-domain and image-domain score updates leave the stationary distribution equal to the true joint p(x | y, z); sequential application of domain-specific conditionals can converge to a different invariant measure, directly threatening transfer of the general optimality result.
Authors: This concern about the invariance of the alternating procedure is well-taken. In the revision we will expand §4 with a dedicated invariance argument. We will show that the alternating updates implement a form of coordinate-wise score matching on the joint posterior, where each half-step conditions on the current sample from the complementary domain. Under the Lipschitz continuity of the score functions and the contractive nature of the diffusion reverse process, the overall Markov chain converges to the correct joint stationary distribution p(x | y, z). We will also reference related results from alternating MCMC and score-based sampling literature to support the argument. revision: yes
- Referee: [Experiments] Experiments section: The abstract states that ADDPS achieves superior perceptual quality on FFHQ, yet supplies no quantitative metrics (FID, LPIPS, etc.), baselines, number of trials, or statistical details. These must be reported with error bars and ablation studies on the alternating schedule to substantiate the superiority claim.
Authors: The full manuscript already contains quantitative results (FID, LPIPS, PSNR) comparing ADDPS against MAP-based and single-domain diffusion baselines on FFHQ. To satisfy the request, we will make these tables and figures more prominent, add error bars computed over 10 independent trials, report the exact test-set size and random seeds, and include a new ablation subsection varying the alternating schedule frequency. Statistical significance tests will be added where appropriate. revision: partial
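The promised error bars over independent trials amount to a small summary computation. The sketch below uses a hypothetical helper `summarize_trials` and placeholder numbers that are not results from the paper. It reports the mean, the sample standard deviation, and the standard error for a list of per-trial metric values such as FID or LPIPS.

```python
import math
import statistics

def summarize_trials(scores):
    """Return mean, sample standard deviation, and standard error for per-trial scores."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two trials to estimate spread")
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
    sem = std / math.sqrt(n)         # standard error of the mean
    return mean, std, sem

# Placeholder values for illustration only, not measurements from the paper.
placeholder_scores = [1.0, 2.0, 3.0, 4.0]
mean, std, sem = summarize_trials(placeholder_scores)
print(f"{mean:.3f} +/- {sem:.3f} (std {std:.3f}, n={len(placeholder_scores)})")
```

Reporting the standard error alongside the trial count lets a reader judge whether a gap between ADDPS and a baseline exceeds run-to-run variation.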
Circularity Check
No significant circularity detected in the derivation chain.
The paper states that semantic decoding is formulated as a Bayesian inverse problem and that posterior sampling preserves the data distribution for optimal perceptual quality. This is a standard property of sampling from the posterior and does not reduce to the ADDPS procedure or any fitted parameter by construction. The alternating dual-domain strategy is introduced as an implementation choice to approximate the joint posterior, with no equations or self-citations shown that make the optimality claim equivalent to the method's own inputs. The derivation remains independent of the proposed receiver and relies on general Bayesian principles rather than self-referential fitting or unverified uniqueness theorems.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Posterior sampling from the data distribution achieves optimal perceptual quality for generative reconstruction.