When Semantic Communication Meets Queueing: Cross-Layer Latency and Task Fidelity Optimization

Tugba Erpek; Yalin E. Sagduyu

arXiv: 2605.05514 · v1 · submitted 2026-05-06 · 💻 cs.IT · cs.AI · cs.LG · cs.NI · eess.SP · math.IT


Pith reviewed 2026-05-08 15:27 UTC · model grok-4.3

classification 💻 cs.IT · cs.AI · cs.LG · cs.NI · eess.SP · math.IT

keywords semantic communication · queueing theory · age of information · cross-layer optimization · wireless fading channels · autoencoder · latency minimization


The pith

Adapting the latent dimension in semantic communication cuts delay and age of information while holding error below a cap.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Semantic communication transmits compact task-oriented representations rather than raw data, using a learned autoencoder to reduce channel resources. This paper models image transmission over block Rayleigh fading where the latent dimension sets both reconstruction and label-prediction accuracy and the time to send each update. Treating that dimension as a tunable semantic rate, the authors derive online controllers that react to queue length and age of information. The controllers keep long-term semantic error below a target yet deliver lower average delay and fresher updates than any fixed-rate choice.

Core claim

By casting the latent dimension as a cross-layer control knob and applying a queue-aware drift-plus-penalty policy together with an age-aware policy, the system minimizes average latency or time-average age of information subject to an average semantic-error constraint and achieves strictly lower delay and AoI than fixed-latent-dimension baselines.
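One way to write the constrained problem described above is sketched below. The symbols N, Ē, and ε follow the figure captions on this page; the per-update delay D(t), the error function e(·), and the candidate set 𝒩 are notation assumed here for illustration and may differ from the paper's own.

```latex
\begin{aligned}
\min_{\{N(t)\}} \quad & \bar{D} = \limsup_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\big[D(t)\big] \\
\text{s.t.} \quad & \bar{E} = \limsup_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\big[e(N(t))\big] \le \varepsilon, \\
& N(t) \in \mathcal{N} \quad \text{for all } t .
\end{aligned}
```

The age-aware variant would keep the same constraint and replace the objective with the time-average age of information.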

What carries the argument

A multi-task semantic autoencoder whose variable latent dimension (complex channel uses per source sample) is adjusted online by drift-plus-penalty and age-aware controllers.
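A minimal sketch of how a drift-plus-penalty style controller could pick the latent dimension per update is given below. It is not the paper's algorithm: the error table `ERR`, the score in `choose_n`, the weight `v`, and the single-server queue driven by a Lindley recursion are all assumptions made for illustration. The virtual queue `z` grows whenever the chosen N has error above ε, which pushes later choices toward larger N.

```python
import random

# Hypothetical average semantic error for each candidate latent dimension N.
# Placeholder values for illustration only; they are not the paper's measurements.
ERR = {10: 0.28, 15: 0.22, 20: 0.19}


def choose_n(wait, z, eps, v):
    """Pick the N with the smallest drift-plus-penalty style score (assumed form).

    v * (1 + wait) * n : penalty for service time, weighted up under congestion
    z * (ERR[n] - eps) : pressure from the virtual fidelity queue z
    """
    return min(ERR, key=lambda n: v * (1.0 + wait) * n + z * (ERR[n] - eps))


def simulate(lam, eps, v=0.01, fixed_n=None, num_updates=50_000, seed=0):
    """FIFO single-server queue where serving an update takes N time units."""
    rng = random.Random(seed)
    wait, z = 0.0, 0.0
    total_delay = 0.0
    total_err = 0.0
    for _ in range(num_updates):
        n = fixed_n if fixed_n is not None else choose_n(wait, z, eps, v)
        total_delay += wait + n            # waiting time plus service time
        total_err += ERR[n]
        z = max(z + ERR[n] - eps, 0.0)     # virtual fidelity queue update
        gap = rng.expovariate(lam)         # time until the next arrival
        wait = max(wait + n - gap, 0.0)    # Lindley recursion for the next update
    return total_delay / num_updates, total_err / num_updates, z


if __name__ == "__main__":
    for label, fixed in [("adaptive", None), ("fixed N=15", 15), ("fixed N=20", 20)]:
        delay, err, _ = simulate(lam=0.03, eps=0.25, fixed_n=fixed)
        print(f"{label:>11}: mean delay {delay:.2f}, mean error {err:.3f}")
```

Passing `fixed_n` gives a fixed-N baseline under the same arrival sequence, so the adaptive and fixed policies can be compared at the same ε. By construction of the virtual queue, the adaptive run's mean error is at most ε plus the final value of `z` divided by the number of updates.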

Load-bearing premise

The autoencoder can be trained so that larger latent dimensions reliably improve both image reconstruction and label accuracy over Rayleigh fading, and the service time for each semantic packet grows linearly with the chosen latent dimension.
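The channel and service-time premise can be illustrated with a short standard-library sketch. It assumes unit average symbol power, a single fading coefficient per block, and a hypothetical `seconds_per_channel_use` parameter; the autoencoder itself is replaced by a constant stand-in vector, since the page does not describe the network.

```python
import math
import random


def block_rayleigh_awgn(symbols, snr_db, rng):
    """Send one block of complex symbols through y = h * x + w.

    h ~ CN(0, 1) is drawn once per block (block fading);
    w ~ CN(0, 1 / snr) is drawn per symbol, assuming unit average symbol power.
    """
    snr = 10.0 ** (snr_db / 10.0)
    h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
    sigma = math.sqrt(0.5 / snr)  # noise standard deviation per real dimension
    received = [h * x + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                for x in symbols]
    return received, h


def service_time(n, seconds_per_channel_use):
    """Linear service-time premise: N channel uses take N symbol durations."""
    return n * seconds_per_channel_use


rng = random.Random(1)
latent = [complex(1.0, 0.0)] * 20          # stand-in for an N = 20 encoder output
received, h = block_rayleigh_awgn(latent, snr_db=10.0, rng=rng)
```

Doubling N doubles `service_time` regardless of the fade `h`, which is exactly the linear premise stated above; whether accuracy improves with N is a property of the trained model and is not captured by this sketch.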

What would settle it

A simulation or hardware test in which the adaptive policies produce higher average delay or higher average AoI than the best fixed latent dimension while meeting the same long-term semantic-error target would disprove the claimed benefit.

Figures

Figures reproduced from arXiv: 2605.05514 by Tugba Erpek, Yalin E. Sagduyu.

**Figure 1.** SemCom system model.

**Figure 2.** Accuracy vs. SNR in SemCom.

**Figure 3.** Accuracy vs. latent dimension in SemCom.

**Figure 4.** Delay vs. arrival rate. The dynamic policy achieves lower delay than fixed-N baselines by adapting N using the backlog and virtual fidelity queue, selecting smaller N when backlog is low and increasing N only to maintain Ē ≤ ε. Fixed-N schemes show clear feasibility limits: N = 10 supports only ε = 0.3, N = 15 meets ε = 0.25 but struggles at ε = 0.2, and only N = 20 satisfies ε ∈ {0.2, 0.25, 0.3} at much…

**Figure 5.** AoI vs. arrival rate.

Original abstract

Semantic communication (SemCom) with learned encoder-decoder architectures enables end-to-end learning of compact task-oriented representations optimized for the wireless channel, reducing channel resources needed to convey task-relevant information and improving spectrum efficiency. This paper studies semantic image transmission over block Rayleigh fading with AWGN using a multi-task semantic autoencoder that jointly reconstructs images and predicts labels from the received waveform. The latent dimension (complex channel uses per source sample) serves as a cross-layer control variable governing semantic fidelity and channel resource usage. We characterize the resulting latency-task fidelity tradeoff: larger latent representations improve inference accuracy but increase service time, channel uses, and queueing delay. Building on this insight, we develop online semantic-rate controllers that adapt the latent dimension per update under a long-term semantic error constraint. A queue-aware drift-plus-penalty policy minimizes delay subject to an average semantic error cap, while a complementary age-aware policy minimizes time-average Age of Information (AoI). By adapting the semantic rate to congestion and fidelity requirements, the proposed framework improves spectrum utilization and enables timely semantic updates with significantly lower delay and AoI than fixed-rate baselines.
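For readers less familiar with the AoI metric named in the abstract, the following sketch computes a time-average age from matched generation and delivery timestamps. It assumes FIFO delivery with increasing timestamps and measures the average from the first delivery to the last; the function name and the sample timestamps are hypothetical.

```python
def time_average_aoi(generated, delivered):
    """Time-average age over [delivered[0], delivered[-1]] for FIFO deliveries.

    Between consecutive deliveries the age grows linearly from
    delivered[k] - generated[k] to delivered[k + 1] - generated[k].
    """
    if len(generated) != len(delivered) or len(delivered) < 2:
        raise ValueError("need at least two matched generation/delivery times")
    area = 0.0
    for k in range(len(delivered) - 1):
        start_age = delivered[k] - generated[k]
        end_age = delivered[k + 1] - generated[k]
        area += (end_age ** 2 - start_age ** 2) / 2.0  # trapezoid under the sawtooth
    return area / (delivered[-1] - delivered[0])


print(time_average_aoi([0.0, 2.0, 4.0], [1.0, 3.0, 5.0]))  # 2.0
```

In the example, each update is delivered one time unit after generation and the next delivery follows two units later, so the age ramps from 1 to 3 in every interval and averages 2. A larger latent dimension lengthens the gap between generation and delivery, which raises both ends of each ramp.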

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper adapts semantic latent dimension via drift-plus-penalty policies to cut delay and AoI, but the service-time model in Rayleigh fading needs closer scrutiny.


The core contribution is a pair of online controllers that treat the latent dimension of a multi-task semantic autoencoder as the decision variable. One uses queue-aware drift-plus-penalty to keep average semantic error below a cap while minimizing delay; the other does the same for Age of Information. Both operate over block Rayleigh fading with a joint image-reconstruction and label-prediction autoencoder.

This moves semantic communication from static rate selection into congestion-responsive adaptation, which is a direct extension of earlier fixed-latent-dimension work. The framing of the latency-fidelity tradeoff is clear and the choice of Lyapunov tools is appropriate for the long-term constraints involved. The idea that you can shrink the representation when the queue grows and expand it when the channel is good is practically useful for task-oriented links.

The modeling choice for service time is the main soft spot. The paper sets service time proportional to latent dimension times channel uses, but block Rayleigh fading introduces random outages and variable decoding success that are not folded into that formula. If the analysis relies only on average channel statistics or assumes fixed success probability, the optimality claims and the reported gains versus fixed-rate baselines become sensitive to that assumption. The abstract supplies no derivations or numerical results, so the size of the improvement remains unclear until the full simulations are checked.

This paper is aimed at researchers who already work on semantic or task-oriented wireless systems and want to add queueing and timeliness metrics. A reader looking for concrete cross-layer policies will find usable ideas here. It is worth sending to peer review so a referee can verify the service-time model and the actual performance numbers.
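To make the service-time concern concrete, the sketch below works through a hypothetical alternative model in which an outage triggers a retransmission in an independent fading block. This is not the model described on this page; the SNR threshold, the retransmission rule, and both function names are assumptions. It uses the standard result that the channel power gain under Rayleigh fading is exponentially distributed with unit mean.

```python
import math


def rayleigh_outage_prob(avg_snr_db, threshold_db):
    """P(|h|^2 * avg_snr < threshold) when |h|^2 ~ Exp(1)."""
    avg_snr = 10.0 ** (avg_snr_db / 10.0)
    threshold = 10.0 ** (threshold_db / 10.0)
    return 1.0 - math.exp(-threshold / avg_snr)


def expected_service_time(n, p_out, t_cu=1.0):
    """Mean service time if each outage costs one more block of n channel uses.

    The number of attempts is geometric with success probability 1 - p_out,
    so the mean number of attempts is 1 / (1 - p_out).
    """
    return n * t_cu / (1.0 - p_out)


p = rayleigh_outage_prob(avg_snr_db=10.0, threshold_db=0.0)
print(round(p, 4))                       # 0.0952
print(expected_service_time(20, 0.5))    # 40.0
```

Under this assumed retransmission model, service time is no longer a deterministic function of N, which is the sensitivity the note points at.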

Referee Report

1 major / 2 minor

Summary. The paper studies semantic image transmission over block Rayleigh fading channels using a multi-task semantic autoencoder that jointly performs image reconstruction and label prediction. The latent dimension serves as a controllable semantic rate that trades off task fidelity against service time, channel uses, and queueing delay. The authors characterize this latency-fidelity tradeoff and propose two online controllers based on the Lyapunov drift-plus-penalty framework: a queue-aware policy that minimizes average delay subject to a long-term semantic error constraint, and an age-aware policy that minimizes time-average AoI. They claim that adapting the semantic rate to congestion and fidelity requirements yields significantly lower delay and AoI than fixed-rate baselines while improving spectrum utilization.

Significance. If the modeling assumptions and derivations hold, the work is significant for providing a principled cross-layer framework that integrates semantic communications with stochastic queueing optimization. The application of drift-plus-penalty to control latent dimension under semantic error constraints offers a concrete method for managing the accuracy-resource tradeoff in task-oriented wireless systems, which could inform low-latency semantic designs in future networks. The explicit use of a multi-task autoencoder over fading channels and the dual policy formulations (delay and AoI) are strengths that extend standard semantic communication results.

major comments (1)

[System model and queueing analysis] In the system model and queueing dynamics section: the service time is defined as latent dimension times channel uses, allowing the MDP state to treat it as a deterministic controllable quantity under the average semantic error constraint. However, in block Rayleigh fading, instantaneous channel realizations affect the decoding success probability of the multi-task autoencoder output; poor fades can cause outages or require retransmissions that extend effective service time. The analysis appears to rely on average channel statistics without folding this variability into the service-time formula or the policy derivation, which directly impacts the claimed optimality and performance gains versus fixed-rate baselines.

minor comments (2)

[Abstract] The abstract states the tradeoff characterization and policy improvements but supplies no derivations, simulation results, or validation details; adding a brief quantitative statement on the reported gains (e.g., percentage reductions in delay/AoI) would strengthen the summary.
[Problem formulation] Notation for the multi-task loss function and the semantic error metric should be introduced earlier and used consistently when stating the long-term constraint in the optimization problems.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. The feedback on the system modeling assumptions is valuable, and we address it in detail below while clarifying the scope of our analysis.


Referee: [System model and queueing analysis] In the system model and queueing dynamics section: the service time is defined as latent dimension times channel uses, allowing the MDP state to treat it as a deterministic controllable quantity under the average semantic error constraint. However, in block Rayleigh fading, instantaneous channel realizations affect the decoding success probability of the multi-task autoencoder output; poor fades can cause outages or require retransmissions that extend effective service time. The analysis appears to rely on average channel statistics without folding this variability into the service-time formula or the policy derivation, which directly impacts the claimed optimality and performance gains versus fixed-rate baselines.

Authors: We appreciate the referee highlighting this modeling choice. In our framework (Section II), each semantic update is transmitted in a single block using exactly d complex channel uses, where d is the controllable latent dimension. Consequently, the service time is deterministic and proportional to d, as we do not incorporate ARQ or retransmissions; a transmission attempt always occupies the same duration regardless of the instantaneous fade. The effect of block Rayleigh fading is instead incorporated via the long-term average semantic error constraint: for a chosen d, the error probability (joint reconstruction and label prediction) is the expectation over the fading distribution at the operating SNR, and the Lyapunov drift-plus-penalty policy enforces that the time-average error remains below a prescribed threshold. The MDP state therefore treats service time as a deterministic function of the action d, which is consistent with the one-shot semantic transmission model and the average-fidelity objective. This formulation yields policies that are optimal for the defined stochastic optimization problem. We acknowledge that an alternative model with per-realization outages and retransmissions would alter the service-time dynamics and could be a worthwhile extension. To strengthen the manuscript, we have added explicit discussion of these assumptions, their relation to the average-error constraint, and implications for the reported gains versus fixed-rate baselines (revised Section II and new paragraph in the discussion of simulation results).

revision: partial
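The mechanism the rebuttal describes can be written out in a few lines. The derivation below is the standard virtual-queue argument from the drift-plus-penalty literature, not a reproduction of the paper's equations; the symbols Z(t), e(d, h), and the threshold ε are notation assumed here, with d the latent dimension as in the rebuttal.

```latex
\begin{aligned}
\bar{e}(d) &= \mathbb{E}_{h}\big[e(d, h)\big]
  && \text{(error averaged over the fading coefficient } h\text{)} \\
Z(t+1) &= \max\{\, Z(t) + \bar{e}(d(t)) - \varepsilon,\; 0 \,\}
  && \text{(virtual fidelity queue)} \\
Z(T) &\ge Z(0) + \sum_{t=0}^{T-1} \big(\bar{e}(d(t)) - \varepsilon\big)
  && \text{(since } \max\{a, 0\} \ge a\text{)} \\
\frac{1}{T}\sum_{t=0}^{T-1} \bar{e}(d(t)) &\le \varepsilon + \frac{Z(T) - Z(0)}{T}
  && \text{(rearranging)}
\end{aligned}
```

If the policy keeps Z(T)/T tending to zero, the time-average error meets the threshold ε. Fading enters only through the averaged error, which is why service time can remain a deterministic function of d in this formulation.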

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard queueing optimization to an explicit modeling assumption


The paper characterizes the latency-fidelity tradeoff from the explicit modeling choice that service time equals latent dimension times channel uses, then applies the established Lyapunov drift-plus-penalty framework (from external queueing literature) to derive queue-aware and age-aware policies under a long-term semantic error constraint. No equations or steps reduce claimed predictions or optimality results to fitted parameters renamed as outputs, self-definitional loops, or load-bearing self-citations. The reported gains versus fixed-rate baselines follow directly from the policy optimization on the given model rather than from any tautological re-derivation of the inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Framework rests on standard wireless channel and queueing assumptions plus the learned autoencoder; no new entities postulated.

free parameters (1)

latent dimension
Cross-layer control variable adapted online to trade fidelity for service time.

axioms (2)

domain assumption: Block Rayleigh fading with AWGN
Standard model for the wireless channel in semantic image transmission.
domain assumption: Service time proportional to latent dimension and channel uses
Links semantic rate directly to queueing delay.



Reference graph

Works this paper leans on

12 extracted references

[1] D. Gündüz, Z. Qin, I. E. Aguerri, H. S. Dhillon, Z. Yang, A. Yener, K. K. Wong, and C. S. Chae, “Beyond transmitting bits: Context, semantics, and task-oriented communications,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 5–41, 2023.

[2] E. Uysal, O. Kaya, A. Ephremides, J. Gross, M. Codreanu, P. Popovski, M. Assaad, G. Liva, A. Munari, T. Soleymani, B. Soret, and H. Johansson, “Semantic communications in networked systems,” IEEE Network, vol. 36, no. 4, pp. 233–240, 2022.

[3] Y. E. Sagduyu, T. Erpek, A. Yener, and S. Ulukus, “Will 6G be semantic communications? opportunities and challenges from task-oriented and secure communications to integrated sensing,” IEEE Network, vol. 38, no. 6, pp. 72–80, 2024.

[4] E. Bourtsoulatze, D. B. Kurka, and D. Gündüz, “Deep joint source-channel coding for wireless image transmission,” IEEE Transactions on Cognitive Communications and Networking, vol. 5, no. 3, pp. 567–579, 2019.

[5] M. Jankowski, D. Gündüz, and K. Mikolajczyk, “Wireless image retrieval at the edge,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 1, pp. 89–100, 2021.

[6] E. Erdemir, T. Tung, P. L. Dragotti, and D. Gündüz, “Generative joint source-channel coding for semantic image transmission,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 8, pp. 2645–2657, 2023.

[7] Z. Lyu, G. Zhu, J. Xu, B. Ai, and S. Cui, “Semantic communications for image recovery and classification via deep joint source and channel coding,” IEEE Transactions on Wireless Communications, vol. 23, no. 8, pp. 8388–8404, 2024.

[8] Y. E. Sagduyu, T. Erpek, A. Yener, and S. Ulukus, “Age of information in deep learning-driven task-oriented communications,” in IEEE INFOCOM AoI Workshop, 2023.

[9] Y. E. Sagduyu, T. Erpek, A. Yener, and S. Ulukus, “Low-latency task-oriented communications with multi-round, multi-task deep learning,” in ACM MobiCom Workshop on Machine Learning for NextG Networks, 2024.

[10] N. Singh, C. K. Thomas, W. Saad, and E. C. Strinati, “On the computing and communication tradeoff in reasoning-based multi-user semantic communications,” in IEEE Wireless Communications and Networking Conference (WCNC), 2025.

[11] Z. Yang, M. Chen, Z. Zhang, and C. Huang, “Energy efficient semantic communication over wireless networks with rate splitting,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 5, pp. 1484–1495, 2023.

[12] Y. E. Sagduyu, T. Erpek, A. Yener, and S. Ulukus, “Joint sensing and task-oriented communications with image and wireless data modalities for dynamic spectrum access,” in IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), 2024.
