arxiv: 2604.20278 · v1 · submitted 2026-04-22 · 📡 eess.SY · cs.SY

Lightweight Low-SNR-Robust Semantic Communication System for Autonomous Driving

Pith reviewed 2026-05-10 00:02 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords semantic communication · joint source-channel coding · structured pruning · autonomous driving · low SNR robustness · image transmission · vehicle-to-vehicle · deep learning

The pith

A pruned deep JSCC model for vehicle image sharing keeps reconstruction quality and low-SNR robustness after removing over half its parameters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a semantic communication system for vehicle-to-vehicle image transmission in autonomous driving to overcome limited onboard resources and weak performance at low signal-to-noise ratios. It applies structured pruning to a joint source-channel coding network and introduces a training-deployment separation strategy to enable standard digital modulation. Simulations on urban driving scenes demonstrate that the reduced model matches the original in image quality and channel robustness while outperforming conventional separate source-channel coding at low SNR. The approach targets practical deployment where compute is scarce and wireless conditions fluctuate.

Core claim

The proposed lightweight low-SNR-robust system implements structured pruning on a deep joint source-channel coding model using batch normalization scaling factors and L1 regularization, combined with uniform quantization and M-QAM modulation handled by a training-deployment separation strategy. This yields a model with more than half the parameters removed that maintains comparable image reconstruction performance and exhibits clear advantages over traditional communication methods under low SNR conditions, as shown in Cityscapes dataset experiments.
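The quantization and modulation path described above can be illustrated with a minimal sketch. It is an assumption-laden toy, not the paper's design: the function names, the feature range [-1, 1], the direct mapping of each quantization level to one axis of a square QAM constellation, and the AWGN channel are all chosen here for illustration.

```python
import math
import random

def uniform_quantize(values, n_bits, lo=-1.0, hi=1.0):
    """Map reals in [lo, hi] to integer levels 0 .. 2**n_bits - 1."""
    step = (hi - lo) / (2 ** n_bits - 1)
    levels = [round((min(max(v, lo), hi) - lo) / step) for v in values]
    return levels, step

def dequantize(levels, step, lo=-1.0):
    """Map integer levels back to their reconstruction values."""
    return [lo + k * step for k in levels]

def qam_modulate(levels, n_bits):
    """Pair consecutive levels as the I and Q amplitudes of one symbol."""
    m = 2 ** n_bits
    amps = [2 * k - (m - 1) for k in levels]  # levels -> odd-integer amplitudes
    return [complex(i, q) for i, q in zip(amps[0::2], amps[1::2])]

def qam_demodulate(symbols, n_bits):
    """Nearest-amplitude decision on each axis, back to integer levels."""
    m = 2 ** n_bits

    def decide(a):
        return min(max(round((a + m - 1) / 2), 0), m - 1)

    levels = []
    for s in symbols:
        levels += [decide(s.real), decide(s.imag)]
    return levels

def awgn(symbols, snr_db, rng):
    """Add complex Gaussian noise at the given SNR relative to mean symbol power."""
    power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    sigma = math.sqrt(power / 10 ** (snr_db / 10) / 2)
    return [s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in symbols]

rng = random.Random(0)
features = [rng.uniform(-1, 1) for _ in range(8)]     # stand-in for JSCC features
levels, step = uniform_quantize(features, n_bits=2)   # 2 bits per axis -> 16-QAM
received = awgn(qam_modulate(levels, n_bits=2), snr_db=30, rng=rng)
recovered = dequantize(qam_demodulate(received, n_bits=2), step)
```

With this mapping a symbol error moves a feature to a neighbouring quantization level rather than flipping an arbitrary bit, which is one plausible reason to adapt the modulation to the features; whether the paper uses the same mapping is not stated here. The sketch expects an even number of features so that levels pair up cleanly.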

What carries the argument

Structured pruning based on batch normalization layer scaling factors and L1 regularization, which reduces model complexity while preserving image reconstruction quality in the JSCC semantic communication framework.
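A minimal sketch of how pruning driven by batch normalization scaling factors typically works, in the style of network slimming. The function names, the global percentile threshold, and the toy scaling factors are assumptions for illustration; the paper's exact selection rule may differ.

```python
def l1_subgradient(gammas, lam):
    """Subgradient of lam * sum(|gamma|), added to the task-loss gradient
    during sparsity training so unimportant scaling factors shrink to zero."""
    return [lam * ((g > 0) - (g < 0)) for g in gammas]

def channel_masks(layer_gammas, prune_ratio):
    """Keep channels whose |gamma| exceeds a global percentile threshold."""
    all_mags = sorted(abs(g) for layer in layer_gammas for g in layer)
    n_prune = int(len(all_mags) * prune_ratio)
    threshold = all_mags[n_prune - 1] if n_prune > 0 else -1.0
    masks = [[abs(g) > threshold for g in layer] for layer in layer_gammas]
    # Guard (an assumption of this sketch): never empty a layer completely.
    for layer, mask in zip(layer_gammas, masks):
        if not any(mask):
            keep = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[keep] = True
    return masks

# Toy scaling factors for two layers after sparsity training (hypothetical values).
layer_gammas = [[0.9, 0.01, 0.5, 0.02], [0.03, 0.7, 0.04, 0.8]]
masks = channel_masks(layer_gammas, prune_ratio=0.5)
```

Because whole channels are removed rather than individual weights, the surviving network stays dense and needs no sparse-matrix support at inference time, which is what makes the pruning "structured".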

If this is right

  • The pruned model fits on resource-constrained vehicle terminals while supporting collaborative perception tasks.
  • M-QAM compatibility allows direct integration into existing digital wireless systems used in automotive applications.
  • Low-SNR robustness reduces the risk of abrupt failures in image sharing during poor channel conditions.
  • Over 50 percent parameter reduction lowers memory and compute demands without sacrificing reconstruction quality.
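To make the parameter-reduction figure concrete, the following arithmetic sketch counts weights in a plain chain of convolutions before and after halving the hidden widths. The layer widths are hypothetical and are not taken from the paper; the point is only that the channel pruning ratio and the parameter reduction are different quantities.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Number of weights (and optional biases) in one k x k convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)

def total_params(channels, k=3):
    """channels = [c0, c1, ..., cn]: a plain chain of k x k convolutions."""
    return sum(conv_params(a, b, k) for a, b in zip(channels, channels[1:]))

# Hypothetical encoder: RGB input, two hidden layers, 16-channel latent.
full = total_params([3, 64, 128, 16])
pruned = total_params([3, 32, 64, 16])   # half of every hidden width kept
reduction = 1 - pruned / full            # roughly 0.70 for these widths
```

A layer between two pruned layers loses input and output channels at once, so its parameter count falls faster than the channel count. This is why a moderate channel pruning ratio can already remove more than half of the parameters.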

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The pruning method based on scaling factors could extend to other sensor modalities such as LiDAR data in multi-vehicle perception networks.
  • Dynamic adjustment of the pruning threshold during operation might adapt the model size to instantaneous channel quality.
  • The separation strategy offers a template for incorporating non-differentiable digital operations into other end-to-end trained communication systems.

Load-bearing premise

The training-deployment separation strategy handles the non-differentiable quantization without causing major performance loss when channels vary over time.
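The premise can be made concrete with a small sketch of why quantization blocks gradient-based training and what a separated training path might look like. The additive-noise surrogate used here is an assumption chosen for illustration; the summary above does not state which surrogate, if any, the paper uses during training.

```python
import random

STEP = 0.25  # hypothetical quantization step

def hard_quantize(x, step=STEP):
    """Deployment path: true uniform rounding (piecewise constant in x)."""
    return step * round(x / step)

def training_surrogate(x, rng, step=STEP):
    """Training path (assumed): additive uniform noise one step wide."""
    return x + rng.uniform(-step / 2, step / 2)

def forward(x, training, rng=None):
    """Select the path by phase, which is the essence of the separation."""
    return training_surrogate(x, rng) if training else hard_quantize(x)

def finite_diff(f, x, eps=1e-4):
    """Central finite difference as a stand-in for a gradient."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

grad_hard = finite_diff(hard_quantize, 0.3)        # flat region, so 0
grad_soft = finite_diff(lambda v: v + 0.05, 0.3)   # noise held fixed, so about 1
```

Both paths perturb a value by at most half a step, so the training path sees distortion of a similar magnitude to deployment while keeping a usable gradient. The open question raised by the premise is whether that match still holds once the channel varies over time.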

What would settle it

A live vehicle-to-vehicle field test measuring whether the deployed pruned model shows lower PSNR or perception accuracy than the full model under fluctuating real-world SNR conditions.
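For reference, the PSNR comparison such a test would rely on can be computed as in the sketch below. The pixel values are toy numbers invented for illustration, not measurements, and the flat-list image representation is a simplification.

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Toy example: one reference and two reconstructions of differing quality.
reference = [10, 200, 30, 40]
closer = [12, 198, 33, 41]
farther = [13, 197, 34, 42]
gap_db = psnr(reference, closer) - psnr(reference, farther)
```

A field test would report such a gap between the full and pruned models at each measured SNR, ideally alongside a perception metric, since PSNR alone does not show whether task-relevant content survives.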

Figures

Figures reproduced from arXiv: 2604.20278 by Junhui Zhao, Minjie Wei, Ruixing Ren.

Figure 1: Block Diagram of V2V Semantic Communication
Figure 2: Lightweight Low-SNR-Robust Semantic Communication Scheme.
Figure 3: Comparison of the performance of the proposed scheme under different pruning ratios with the traditional BPG-LDPC
Figure 4: Comparison of the performance of the proposed deep JSCC when pruning ratio is set to
Figure 5: Examples of reconstructed images generated from the proposed scheme and the baseline digital scheme, with the
Figure 6: Comparison of performance between two different compressed models, where the results of (a) and (b) were obtained
Original abstract

Image transmission for vehicle-to-vehicle collaborative perception in autonomous driving faces challenges including limited on-board terminal resources, time-varying wireless channel fading, and poor robustness under low signal-to-noise ratio (SNR). Traditional separate source-channel coding schemes suffer from the cliff effect, while existing semantic communication models are limited by large parameter sizes and weak digital compatibility. This paper proposes a lightweight, low-SNR-robust deep joint source-channel coding (JSCC) semantic communication system. First, structured pruning is implemented based on batch normalization layer scaling factors and L1 regularization, which significantly reduces model complexity while ensuring image reconstruction quality. Second, a uniform quantization and M-QAM modulation scheme adapted to JSCC features is designed, and a training-deployment separation strategy is adopted to address the non-differentiable quantization problem, enabling compatibility with existing digital communication systems. Simulation results on the Cityscapes dataset show that the pruned model maintains comparable performance and robustness to the original one, even with over half of its parameters removed. Notably, the proposed scheme exhibits significant advantages over conventional communication methods under low SNR conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a lightweight deep joint source-channel coding (JSCC) semantic communication system for vehicle-to-vehicle image transmission in autonomous driving. It incorporates structured pruning using batch normalization scaling factors and L1 regularization to reduce model size by over 50%, along with a uniform quantization and M-QAM modulation scheme using a training-deployment separation strategy to ensure digital compatibility. Simulations on the Cityscapes dataset are used to claim that the pruned model maintains comparable reconstruction performance and robustness, with significant advantages over conventional communication methods under low SNR conditions.

Significance. If the performance claims hold under more application-relevant evaluations, this work could contribute to the development of efficient semantic communication systems suitable for resource-limited autonomous vehicles operating in challenging wireless environments. The combination of model compression and digital modulation compatibility addresses practical deployment issues in semantic communications.

major comments (2)
  1. [Simulation results] The evaluation is based solely on image reconstruction metrics such as PSNR and SSIM on the Cityscapes dataset. However, for the claimed application in autonomous driving collaborative perception, downstream task performance metrics (e.g., mean Average Precision for object detection or Intersection over Union for segmentation) are required to verify that semantic content survives the channel noise and quantization. Without these, the assertion of 'significant advantages' for autonomous driving lacks direct support.
  2. [Proposed method (quantization)] The training-deployment separation strategy is introduced to handle the non-differentiable quantization during training. However, no quantitative analysis, ablation studies, or results are provided to demonstrate that this strategy does not introduce significant performance degradation in time-varying channels, which is central to the low-SNR robustness claim.
minor comments (1)
  1. [Abstract] The abstract mentions 'simulation results' but does not specify the exact metrics, baselines, or how low SNR is defined, which would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful for the referee's thorough review and constructive feedback. Below, we provide point-by-point responses to the major comments and outline the revisions we will make to address them.

  1. Referee: [Simulation results] The evaluation is based solely on image reconstruction metrics such as PSNR and SSIM on the Cityscapes dataset. However, for the claimed application in autonomous driving collaborative perception, downstream task performance metrics (e.g., mean Average Precision for object detection or Intersection over Union for segmentation) are required to verify that semantic content survives the channel noise and quantization. Without these, the assertion of 'significant advantages' for autonomous driving lacks direct support.

    Authors: We agree that evaluating downstream tasks is important to fully validate the benefits for autonomous driving. Although our manuscript emphasizes image reconstruction as an indicator of semantic fidelity, we will enhance the experimental section by incorporating object detection performance metrics. Specifically, we will apply a standard object detector to the reconstructed images and report mAP values across SNR levels, comparing our method to baselines. This addition will provide direct evidence for the semantic preservation under low SNR and quantization. revision: yes

  2. Referee: [Proposed method (quantization)] The training-deployment separation strategy is introduced to handle the non-differentiable quantization during training. However, no quantitative analysis, ablation studies, or results are provided to demonstrate that this strategy does not introduce significant performance degradation in time-varying channels, which is central to the low-SNR robustness claim.

    Authors: The separation strategy decouples training, which uses a straight-through estimator or a similar differentiable surrogate, from deployment, which applies the actual quantization. We recognize that additional validation is needed. In the revised manuscript, we will present ablation studies that quantify the performance impact of this strategy. These will compare static and time-varying channels (e.g., using Jakes' model for Doppler effects) at low SNR, with metrics such as PSNR to measure any degradation. These results are intended to confirm the robustness of the approach. revision: yes
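A time-varying channel of the kind mentioned in the response could be simulated with a sum-of-sinusoids generator. The sketch below uses Clarke's model with random arrival angles and phases, which yields the Jakes Doppler spectrum. The function names, sample rate, Doppler frequency, and path count are illustrative assumptions, not settings from the manuscript.

```python
import cmath
import math
import random

def clarke_fading(n_samples, sample_rate_hz, doppler_hz, n_paths=16, rng=None):
    """One realization of flat Rayleigh fading gains h[0 .. n_samples - 1]."""
    rng = rng or random.Random()
    angles = [rng.uniform(0, 2 * math.pi) for _ in range(n_paths)]  # arrival angles
    phases = [rng.uniform(0, 2 * math.pi) for _ in range(n_paths)]  # initial phases
    scale = 1 / math.sqrt(n_paths)                                  # unit mean power
    gains = []
    for k in range(n_samples):
        t = k / sample_rate_hz
        h = sum(cmath.exp(1j * (2 * math.pi * doppler_hz * math.cos(a) * t + p))
                for a, p in zip(angles, phases))
        gains.append(scale * h)
    return gains

def apply_channel(symbols, gains, noise_sigma, rng):
    """y[k] = h[k] * x[k] + n[k], with complex Gaussian noise per sample."""
    return [h * x + complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
            for h, x in zip(gains, symbols)]

rng = random.Random(7)
gains = clarke_fading(n_samples=1000, sample_rate_hz=10_000, doppler_hz=100, rng=rng)
received = apply_channel([1 + 0j] * 1000, gains, noise_sigma=0.1, rng=rng)
```

Feeding the same transmitted symbols through a static gain and through `gains` would give the static versus time-varying comparison that the proposed ablation describes.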

Circularity Check

0 steps flagged

No significant circularity; results are empirical simulation outcomes


The paper presents a proposed lightweight JSCC architecture with structured pruning via batch-norm scaling factors plus L1 regularization, followed by a uniform quantization/M-QAM scheme and a training-deployment separation strategy to handle non-differentiability. All performance claims rest on direct Cityscapes simulations that compare reconstruction metrics (implicitly PSNR/SSIM) against conventional baselines at varying SNRs. No derivation chain exists that reduces a claimed prediction or first-principles result back to its own fitted inputs or self-definitions; the methods are described as explicit design choices whose outputs are measured rather than algebraically forced. Any self-citations are peripheral and not load-bearing for the central empirical findings.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions in machine learning pruning and digital communication systems, with no new entities introduced.

axioms (2)
  • domain assumption: The Cityscapes dataset is representative of real-world autonomous driving scenes for evaluating image reconstruction quality.
    Used for simulation results.
  • standard math: Wireless channels can be modeled with time-varying fading and additive noise for low SNR conditions.
    Standard in communication simulations.
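
The second axiom is usually written as the following flat-fading model. The symbols are conventional notation introduced here as an assumption, not taken from the paper: $x_k$ is the transmitted symbol, $h_k$ the time-varying channel gain, $n_k$ the additive noise, and $P$ the average transmit power.

```latex
y_k = h_k x_k + n_k, \qquad n_k \sim \mathcal{CN}(0, \sigma^2), \qquad
\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10} \frac{\mathbb{E}\left[|h_k|^2\right] P}{\sigma^2},
\qquad P = \mathbb{E}\left[|x_k|^2\right]
```

Under this convention, "low SNR" means a small ratio of average received signal power to noise power; with $h_k = 1$ the model reduces to a plain AWGN channel.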
