Lightweight Low-SNR-Robust Semantic Communication System for Autonomous Driving
Pith reviewed 2026-05-10 00:02 UTC · model grok-4.3
The pith
A pruned deep JSCC model for vehicle image sharing keeps reconstruction quality and low-SNR robustness after removing over half its parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed lightweight low-SNR-robust system implements structured pruning on a deep joint source-channel coding model using batch normalization scaling factors and L1 regularization, combined with uniform quantization and M-QAM modulation handled by a training-deployment separation strategy. This yields a model with more than half the parameters removed that maintains comparable image reconstruction performance and exhibits clear advantages over traditional communication methods under low SNR conditions, as shown in Cityscapes dataset experiments.
What carries the argument
Structured pruning based on batch normalization layer scaling factors and L1 regularization, which reduces model complexity while preserving image reconstruction quality in the JSCC semantic communication framework.
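As a rough illustration of the mechanism above, the sketch below shows the two pieces that scaling-factor pruning usually involves: an L1 penalty on the batch normalization scaling factors during training, and a global magnitude threshold that decides which channels survive. It is a minimal sketch in the style of network slimming, not the paper's implementation. The function names, layer names, penalty weight, and 50 percent ratio are all assumptions made for illustration.

```python
from typing import Dict, List

Gammas = Dict[str, List[float]]  # hypothetical: BN scaling factors per layer


def l1_penalty(gammas: Gammas, lam: float) -> float:
    """Sparsity term added to the training loss: lam * sum(|gamma|)."""
    return lam * sum(abs(g) for layer in gammas.values() for g in layer)


def global_threshold(gammas: Gammas, prune_ratio: float) -> float:
    """Magnitude at or below which channels are pruned, taken over all layers."""
    mags = sorted(abs(g) for layer in gammas.values() for g in layer)
    k = int(len(mags) * prune_ratio)  # number of channels to drop
    return mags[k - 1] if k > 0 else -1.0


def channel_masks(gammas: Gammas, prune_ratio: float) -> Dict[str, List[bool]]:
    """Per-layer keep/drop masks (True = keep the channel)."""
    thr = global_threshold(gammas, prune_ratio)
    masks = {}
    for name, layer in gammas.items():
        keep = [abs(g) > thr for g in layer]  # ties at the threshold are pruned too
        if not any(keep):
            # Assumption: never remove a whole layer; keep its strongest channel.
            best = max(range(len(layer)), key=lambda i: abs(layer[i]))
            keep[best] = True
        masks[name] = keep
    return masks


# Toy scaling factors after sparsity training (made-up values).
toy = {"enc1": [0.9, 0.01, 0.5, 0.02], "enc2": [0.03, 0.7, 0.04, 0.8]}
masks = channel_masks(toy, prune_ratio=0.5)
# masks["enc1"] -> [True, False, True, False]
# masks["enc2"] -> [False, True, False, True]
```

In a real network the masks would then be used to rebuild each convolution with only the kept channels, followed by fine-tuning. Whether the paper uses a global or per-layer threshold is not stated here.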
If this is right
- The pruned model fits on resource-constrained vehicle terminals while supporting collaborative perception tasks.
- M-QAM compatibility allows direct integration into existing digital wireless systems used in automotive applications.
- Low-SNR robustness reduces the risk of abrupt failures in image sharing during poor channel conditions.
- Over 50 percent parameter reduction lowers memory and compute demands without sacrificing reconstruction quality.
Where Pith is reading between the lines
- The pruning method based on scaling factors could extend to other sensor modalities such as LiDAR data in multi-vehicle perception networks.
- Dynamic adjustment of the pruning threshold during operation might adapt the model size to instantaneous channel quality.
- The separation strategy offers a template for incorporating non-differentiable digital operations into other end-to-end trained communication systems.
Load-bearing premise
The training-deployment separation strategy handles the non-differentiable quantization without causing major performance loss when channels vary over time.
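To make the non-differentiable step concrete, the sketch below shows one possible deployment-side chain: clip and uniformly quantize a feature value, map the resulting index to a Gray-coded square M-QAM symbol, then invert both steps at the receiver. The clipping range, the one-feature-per-symbol mapping, and every function name are assumptions for illustration. The training side is deliberately left out, because how the paper bypasses quantization during training is not specified here.

```python
def quantize(x: float, bits: int, lo: float = -1.0, hi: float = 1.0) -> int:
    """Uniform quantizer: clip x to [lo, hi], return a level index in [0, 2**bits - 1]."""
    levels = 2 ** bits
    x = min(max(x, lo), hi)
    return min(int((x - lo) / (hi - lo) * levels), levels - 1)


def dequantize(idx: int, bits: int, lo: float = -1.0, hi: float = 1.0) -> float:
    """Reconstruct at the midpoint of the quantization cell."""
    step = (hi - lo) / 2 ** bits
    return lo + (idx + 0.5) * step


def gray_to_binary(g: int) -> int:
    """Convert a Gray-coded integer to its plain binary position."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def qam_modulate(idx: int, bits: int) -> complex:
    """Map a `bits`-bit index to one square 2**bits-QAM symbol (bits must be even)."""
    half = bits // 2
    side = 2 ** half  # amplitude levels per axis
    i_level = 2 * gray_to_binary(idx >> half) - (side - 1)
    q_level = 2 * gray_to_binary(idx & (side - 1)) - (side - 1)
    return complex(i_level, q_level)


def qam_demodulate(sym: complex, bits: int) -> int:
    """Hard decision: nearest constellation point, returned as the original index."""
    half = bits // 2
    side = 2 ** half

    def nearest_gray(v: float) -> int:
        k = min(max(round((v + side - 1) / 2), 0), side - 1)
        return k ^ (k >> 1)  # binary position back to Gray code

    return (nearest_gray(sym.real) << half) | nearest_gray(sym.imag)


# Deployment path for one feature value with 16-QAM (4 bits per symbol).
feature = 0.3
symbol = qam_modulate(quantize(feature, 4), 4)
recovered = dequantize(qam_demodulate(symbol, 4), 4)
```

Over a noiseless channel the only loss is the quantization error, which is at most half a step. The premise above is about whether a model trained without this exact chain still behaves well once the chain is inserted and the channel varies.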
What would settle it
A live vehicle-to-vehicle field test measuring whether the deployed pruned model shows lower PSNR or perception accuracy than the full model under fluctuating real-world SNR conditions.
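The comparison described above comes down to a per-image PSNR gap between the full and pruned models at each measured SNR. The helper below is a minimal sketch of that bookkeeping using the standard PSNR definition for 8-bit images. The flat pixel lists and the function names are hypothetical stand-ins for real image tensors.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], reconstructed: Sequence[float],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def psnr_gap(reference: Sequence[float], full_out: Sequence[float],
             pruned_out: Sequence[float]) -> float:
    """Positive value means the pruned model reconstructs worse than the full model."""
    return psnr(reference, full_out) - psnr(reference, pruned_out)
```

A field test would log this gap together with the instantaneous SNR, so that any degradation can be tied to channel conditions rather than averaged away.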
Original abstract
Image transmission for vehicle-to-vehicle collaborative perception in autonomous driving faces challenges including limited on-board terminal resources, time-varying wireless channel fading, and poor robustness under low signal-to-noise ratio (SNR). Traditional separate source-channel coding schemes suffer from the cliff effect, while existing semantic communication models are limited by large parameter sizes and weak digital compatibility. This paper proposes a lightweight, low-SNR-robust deep joint source-channel coding (JSCC) semantic communication system. First, structured pruning is implemented based on batch normalization layer scaling factors and L1 regularization, which significantly reduces model complexity while ensuring image reconstruction quality. Second, a uniform quantization and M-QAM modulation scheme adapted to JSCC features is designed, and a training-deployment separation strategy is adopted to address the non-differentiable quantization problem, enabling compatibility with existing digital communication systems. Simulation results on the Cityscapes dataset show that the pruned model maintains comparable performance and robustness to the original one, even with over half of its parameters removed. Notably, the proposed scheme exhibits significant advantages over conventional communication methods under low SNR conditions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a lightweight deep joint source-channel coding (JSCC) semantic communication system for vehicle-to-vehicle image transmission in autonomous driving. It incorporates structured pruning using batch normalization scaling factors and L1 regularization to reduce model size by over 50%, along with a uniform quantization and M-QAM modulation scheme using a training-deployment separation strategy to ensure digital compatibility. Simulations on the Cityscapes dataset are used to claim that the pruned model maintains comparable reconstruction performance and robustness, with significant advantages over conventional communication methods under low SNR conditions.
Significance. If the performance claims hold under more application-relevant evaluations, this work could contribute to the development of efficient semantic communication systems suitable for resource-limited autonomous vehicles operating in challenging wireless environments. The combination of model compression and digital modulation compatibility addresses practical deployment issues in semantic communications.
major comments (2)
- [Simulation results] The evaluation is based solely on image reconstruction metrics such as PSNR and SSIM on the Cityscapes dataset. However, for the claimed application in autonomous driving collaborative perception, downstream task performance metrics (e.g., mean Average Precision for object detection or Intersection over Union for segmentation) are required to verify that semantic content survives the channel noise and quantization. Without these, the assertion of 'significant advantages' for autonomous driving lacks direct support.
- [Proposed method (quantization)] The training-deployment separation strategy is introduced to handle the non-differentiable quantization during training. However, no quantitative analysis, ablation studies, or results are provided to demonstrate that this strategy does not introduce significant performance degradation in time-varying channels, which is central to the low-SNR robustness claim.
minor comments (1)
- [Abstract] The abstract mentions 'simulation results' but does not specify the exact metrics, baselines, or how low SNR is defined, which would improve clarity.
Simulated Author's Rebuttal
We are grateful for the referee's thorough review and constructive feedback. Below, we provide point-by-point responses to the major comments and outline the revisions we will make to address them.
- Referee: [Simulation results] The evaluation is based solely on image reconstruction metrics such as PSNR and SSIM on the Cityscapes dataset. However, for the claimed application in autonomous driving collaborative perception, downstream task performance metrics (e.g., mean Average Precision for object detection or Intersection over Union for segmentation) are required to verify that semantic content survives the channel noise and quantization. Without these, the assertion of 'significant advantages' for autonomous driving lacks direct support.
Authors: We agree that evaluating downstream tasks is important to fully validate the benefits for autonomous driving. Although our manuscript emphasizes image reconstruction as an indicator of semantic fidelity, we will enhance the experimental section by incorporating object detection performance metrics. Specifically, we will apply a standard object detector to the reconstructed images and report mAP values across SNR levels, comparing our method to baselines. This addition will provide direct evidence for the semantic preservation under low SNR and quantization. revision: yes
- Referee: [Proposed method (quantization)] The training-deployment separation strategy is introduced to handle the non-differentiable quantization during training. However, no quantitative analysis, ablation studies, or results are provided to demonstrate that this strategy does not introduce significant performance degradation in time-varying channels, which is central to the low-SNR robustness claim.
Authors: The separation strategy decouples training (using a straight-through estimator or a similar differentiable surrogate) from deployment (using actual quantization). We recognize that additional validation is needed. In the revised manuscript, we will present ablation studies showing the performance impact of this strategy. This includes comparisons in both static and time-varying channels (e.g., using Jakes' model for Doppler effects) at low SNR, with metrics like PSNR to quantify any degradation. These results will test the robustness of the approach. revision: yes
Circularity Check
No significant circularity; results are empirical simulation outcomes
The paper presents a proposed lightweight JSCC architecture with structured pruning via batch-norm scaling factors plus L1 regularization, followed by a uniform quantization/M-QAM scheme and a training-deployment separation strategy to handle non-differentiability. All performance claims rest on direct Cityscapes simulations that compare reconstruction metrics (implicitly PSNR/SSIM) against conventional baselines at varying SNRs. No derivation chain exists that reduces a claimed prediction or first-principles result back to its own fitted inputs or self-definitions; the methods are described as explicit design choices whose outputs are measured rather than algebraically forced. Any self-citations are peripheral and not load-bearing for the central empirical findings.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The Cityscapes dataset is representative of real-world autonomous driving scenes for evaluating image reconstruction quality.
- standard math: Wireless channels can be modeled with time-varying fading and additive noise for low SNR conditions.
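The second axiom can be sketched as a block-fading channel, y = h·x + n, where a fresh complex gain h is drawn for each block and the noise variance is set from the target average SNR in dB. This is a minimal sketch under assumed conventions (Rayleigh fading with unit average gain, SNR defined against average transmit power); the paper's actual channel model and parameters are not given here, and `transmit_block` is a hypothetical name.

```python
import math
import random
from typing import List, Sequence, Tuple


def transmit_block(symbols: Sequence[complex], snr_db: float, rng: random.Random,
                   fading: bool = True) -> Tuple[complex, List[complex]]:
    """Send one block through y = h*x + n and return (h, received symbols).

    SNR is the average SNR: mean transmit power over noise variance,
    not the instantaneous SNR after scaling by |h|^2.
    """
    p_sig = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    noise_var = p_sig / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_var / 2.0)  # per real/imaginary component
    if fading:
        # Rayleigh fading: h ~ CN(0, 1), constant within the block.
        h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
    else:
        h = 1 + 0j  # pure AWGN channel
    received = [h * s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                for s in symbols]
    return h, received


# Time variation: call once per block so each block sees a new gain.
rng = random.Random(0)
block = [complex(1, 1), complex(-1, 1), complex(1, -1), complex(-1, -1)]
gains = [transmit_block(block, snr_db=0.0, rng=rng)[0] for _ in range(3)]
```

Correlated fading across blocks (for example a Doppler-driven model) would replace the independent draw of h; the independent draw is the simplest choice that still exercises low-SNR behavior.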