arxiv: 2605.04676 · v1 · submitted 2026-05-06 · 📡 eess.SP

RF-Analyzer: Can Vision-Language Models Learn RF Understanding from Synthetic Data?

Pith reviewed 2026-05-08 16:46 UTC · model grok-4.3

classification 📡 eess.SP
keywords vision-language models · synthetic data · RF spectrograms · signal understanding · generalization · wireless spectrum · physical attribute extraction · SDR platform

The pith

Vision-language models can learn to understand real RF signals from synthetic spectrogram data alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper asks if vision-language models trained only on computer-generated RF spectrograms can still make sense of actual wireless signals captured from the air. The authors find that the models do generalize, successfully pulling out details like which frequencies are in use, how the signal behaves over time, and its strength level. This matters because collecting large amounts of real RF data for training is costly and logistically difficult, whereas synthetic data can be generated in unlimited quantities. They support this by building RF-Analyzer, a system that links software-defined radios directly to the model for real-time testing, and by defining new metrics to measure how well the model describes the physical properties without hallucinating or leaking prompt information. The results hold for typical conditions but break down when signals are very weak or when the synthetic data misses key variations.

Core claim

VLMs trained exclusively on synthetic spectrogram data can generalize to real over-the-air RF environments, particularly for extracting physical signal attributes such as spectral occupancy, temporal behavior, and SNR. This indicates that synthetic data is sufficient for learning transferable representations of RF signal structure, though generalization is limited without contextual priors and fails in low-SNR regimes.

What carries the argument

RF-Analyzer, an SDR-to-AI analysis platform that pairs live spectrum captures with VLM interpretations and uses metrics like Physical Attribute Extraction Score to evaluate generalization from synthetic to real data.
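
The metrics named above are not defined on this page, so the following is only a hypothetical reading of what a Physical Attribute Extraction Score, a Prompt Leakage Rate, and a hallucination count could measure. The `Capture` record, the 10% tolerance, and all three formulas are assumptions made for illustration, not the paper's definitions.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    truth: dict       # reference attributes, e.g. {"snr_db": 12.0, "bursty": True}
    predicted: dict   # attributes parsed from the model's answer
    prompt_only: set  # hypothetical: strings given in the prompt but not visible in the image
    answer: str       # raw model answer


def matches(ref, got, rel_tol):
    """Numbers match within a relative tolerance; everything else must be equal."""
    numeric = (int, float)
    if isinstance(ref, numeric) and not isinstance(ref, bool):
        return isinstance(got, numeric) and abs(got - ref) <= rel_tol * abs(ref)
    return got == ref


def attribute_score(capture, rel_tol=0.1):
    """Assumed PAES-like score: fraction of reference attributes recovered."""
    hits = sum(
        matches(ref, capture.predicted.get(key), rel_tol)
        for key, ref in capture.truth.items()
    )
    return hits / len(capture.truth)


def leakage_rate(captures):
    """Assumed PLR-like rate: share of answers that repeat a prompt-only string."""
    leaked = sum(any(s in c.answer for s in c.prompt_only) for c in captures)
    return leaked / len(captures)


def hallucination_count(capture):
    """Assumed count: predicted attributes with no counterpart in the reference."""
    return sum(1 for key in capture.predicted if key not in capture.truth)
```

Keeping the three quantities separate mirrors the distinction the page draws between extracting physical attributes, echoing the prompt, and asserting unsupported content.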

If this is right

  • VLMs can extract physical attributes from real RF signals after synthetic training.
  • Generalization succeeds for signal properties within the synthetic distribution.
  • Low-SNR regimes and lack of contextual priors limit the transfer.
  • The introduced platform and metrics enable systematic assessment of VLM performance on live RF data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could make advanced spectrum analysis tools more accessible by reducing reliance on expensive real-world datasets.
  • The success suggests potential for applying similar synthetic-data strategies to other signal processing tasks involving visual representations like images of waveforms.
  • Testing with augmented synthetic data that includes more noise variations could improve performance in challenging low-SNR conditions.

Load-bearing premise

The synthetic training data distribution is representative enough of real over-the-air RF variations to support generalization, especially outside the low-SNR regimes explicitly noted as failure cases.
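
To make the premise concrete, here is a minimal sketch of what a synthetic spectrogram at a controlled SNR can look like. It assumes NumPy, a single complex tone in white Gaussian noise, and a Hann-windowed short-time FFT. The function names and parameter values are illustrative and are not taken from the paper's generator.

```python
import numpy as np


def synth_tone_iq(n_samples, fs, f_tone, snr_db, rng):
    """Unit-power complex tone at f_tone (Hz) plus white Gaussian noise at snr_db."""
    t = np.arange(n_samples) / fs
    signal = np.exp(2j * np.pi * f_tone * t)
    noise_power = 10 ** (-snr_db / 10)  # signal power is 1.0
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(n_samples) + 1j * rng.standard_normal(n_samples)
    )
    return signal + noise


def spectrogram_db(iq, n_fft=256, hop=128):
    """Magnitude spectrogram in dB, shape (frames, n_fft), centred on 0 Hz."""
    window = np.hanning(n_fft)
    starts = range(0, len(iq) - n_fft + 1, hop)
    frames = np.stack([iq[s:s + n_fft] * window for s in starts])
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return 20 * np.log10(np.abs(spec) + 1e-12)


rng = np.random.default_rng(0)
fs = 1_000_000  # sample rate in Hz (illustrative)
iq = synth_tone_iq(8192, fs, f_tone=125_000.0, snr_db=10.0, rng=rng)
spec = spectrogram_db(iq)

# Frequency axis matching the shifted FFT, and the strongest time-averaged bin.
freqs = np.fft.fftshift(np.fft.fftfreq(256, d=1 / fs))
peak_hz = freqs[np.argmax(spec.mean(axis=0))]
```

Lowering `snr_db` in a sketch like this is one simple way to probe the low-SNR regime that the page flags as a failure case. Real captures also carry multipath and hardware effects that this toy generator does not model.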

What would settle it

A demonstration that the VLM misidentifies key attributes like spectral occupancy on real signals whose characteristics fall within the range of the synthetic training data.

Figures

Figures reproduced from arXiv: 2605.04676 by Anis Bara, Brahim Mefgouda, Hang Zou, Lina Bariah, Merouane Debbah.

Figure 1: System architecture of RF-Analyzer. Colored blocks indicate functional …
Figure 3: RF Analyzer running on an Ubuntu 24 workstation with an Ettus …
Original abstract

Understanding the wireless spectrum is a fundamental requirement for intelligent communication systems, however, interpreting spectrograms requires extracting multiple physical attributes and reasoning about signal structure, which is a capability that is not achieved by traditional ML approaches. Recent advances in vision-language models (VLMs) demonstrated the possibility of learning such interpretation capabilities directly from data. This paper investigates whether VLMs can learn this capability from synthetic data alone, and more importantly, whether such learned representations generalize to real over-the-air RF environments. To address this question, we introduce RF-Analyzer, an SDR-to-AI analysis platform that integrates live spectrum captures associated with the corresponding VLM-based interpretation, enabling direct evaluation of VLMs outputs on live over-the-air signals. Using this platform, we assess a model trained exclusively on synthetic spectrogram data with general-purpose baselines. To enable systematic analysis, we establish a benchmark framework comprising three metrics, Physical Attribute Extraction Score (PAES), Prompt Leakage Rate (PLR), and hallucination count, to assess signal understanding and grounding. The obtained results demonstrate that VLMs trained on synthetic spectrogram data can generalize to real RF environments, particularly for extracting physical signal attributes such as spectral occupancy, temporal behavior, and SNR. This indicates that synthetic data is sufficient for learning transferable representations of RF signal structure. However, this generalization is limited due to the fact that synthetic training does not provide reliable semantic grounding without contextual priors. In particular, generalization breaks under conditions that are not covered in the synthetic distribution, particularly low-SNR regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the RF-Analyzer SDR-to-AI platform to test whether vision-language models trained exclusively on synthetic spectrogram data can extract physical RF signal attributes (spectral occupancy, temporal behavior, SNR) from live over-the-air captures. It defines three evaluation metrics—Physical Attribute Extraction Score (PAES), Prompt Leakage Rate (PLR), and hallucination count—and reports that the trained VLM generalizes to real signals for these attributes while noting failures in semantic grounding and low-SNR regimes outside the synthetic distribution.

Significance. If the generalization result is robust, the work shows that synthetic data alone can produce transferable representations for physical RF attribute extraction, reducing reliance on scarce real-world labeled captures for spectrum analysis tasks. The RF-Analyzer platform and the PAES/PLR/hallucination benchmark constitute concrete, reusable contributions for evaluating VLM grounding on live SDR data.

major comments (2)
  1. [Abstract] The central generalization claim ('VLMs trained on synthetic spectrogram data can generalize to real RF environments, particularly for extracting physical signal attributes') rests on the untested assumption that the synthetic generator reproduces the statistics of real over-the-air effects beyond the explicitly noted low-SNR breakdown; no ablation or quantitative comparison is supplied for multipath, hardware non-idealities, dynamic interference, or SDR-specific artifacts that would shift the input distribution.
  2. [Evaluation / Results] The manuscript provides no experimental details, statistical tests, baseline comparisons, or error analysis to support the reported generalization (reader note: soundness rated 3.0). Without these, the PAES scores cannot be assessed for reliability or compared against traditional ML approaches mentioned in the abstract.
minor comments (2)
  1. [Abstract] The abstract states positive results but does not report numerical PAES values, sample sizes, or confidence intervals; adding these would improve clarity.
  2. [Benchmark Framework] Notation for the three metrics (PAES, PLR, hallucination count) is introduced without a dedicated definitions subsection or table summarizing their formulas and ranges.
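
Major comment 1 asks for a quantitative comparison between synthetic and real inputs. One simple way to sketch such a check, offered here as an assumption rather than anything the paper or the referee prescribes, is a two-sample Kolmogorov-Smirnov statistic over a scalar feature extracted from each spectrogram, such as an estimated noise floor in dB. The feature choice and the function name are hypothetical.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for value in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, value) / len(a)
        cdf_b = bisect.bisect_right(b, value) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A value near 0 suggests the two feature distributions overlap closely, while a value near 1 suggests they barely overlap. A scalar feature cannot capture every artifact the referee lists, so a check like this would complement an ablation rather than replace it.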

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will incorporate revisions to strengthen the manuscript's rigor and clarity.

  1. Referee: [Abstract] The central generalization claim ('VLMs trained on synthetic spectrogram data can generalize to real RF environments, particularly for extracting physical signal attributes') rests on the untested assumption that the synthetic generator reproduces the statistics of real over-the-air effects beyond the explicitly noted low-SNR breakdown; no ablation or quantitative comparison is supplied for multipath, hardware non-idealities, dynamic interference, or SDR-specific artifacts that would shift the input distribution.

    Authors: We acknowledge that the synthetic generator prioritizes core signal parameters (frequency, bandwidth, modulation, SNR) and does not explicitly simulate all real-world effects such as multipath or hardware non-idealities. The generalization results are based on direct testing via the RF-Analyzer platform on live over-the-air captures, which inherently contain these effects, and the PAES scores reflect performance under those conditions. We agree that explicit analysis of distribution shifts would improve the paper. In revision, we will add a dedicated subsection discussing potential mismatches, including qualitative comparisons of real vs. synthetic spectrograms under multipath and interference, plus limitations in regimes outside the synthetic distribution. revision: partial

  2. Referee: [Evaluation / Results] The manuscript provides no experimental details, statistical tests, baseline comparisons, or error analysis to support the reported generalization (reader note: soundness rated 3.0). Without these, the PAES scores cannot be assessed for reliability or compared against traditional ML approaches mentioned in the abstract.

    Authors: The full manuscript describes the synthetic data generation process, VLM fine-tuning, RF-Analyzer implementation, and the three metrics, with general-purpose VLMs as baselines. We agree that additional rigor is needed for assessing reliability. In the revised manuscript, we will expand the Evaluation section with: full hyperparameter details and training procedure; statistical summaries (means, standard deviations, and confidence intervals) of PAES across multiple real captures; error analysis stratified by SNR and signal type; and explicit numerical comparisons to traditional ML baselines such as CNN classifiers for spectral occupancy. This will include appropriate statistical tests to support the generalization claims. revision: yes
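
The statistical summaries promised in the second response could take roughly the following shape. This is a sketch under stated assumptions: the SNR bin edges, the normal-approximation interval, and the sample records are placeholders chosen for illustration, and none of the numbers come from the paper.

```python
import bisect
import math
import statistics
from collections import defaultdict


def snr_bin(snr_db, edges):
    """Label the SNR bin that snr_db falls into, given ascending bin edges."""
    i = bisect.bisect_right(edges, snr_db)
    if i == 0:
        return f"< {edges[0]:g} dB"
    if i == len(edges):
        return f">= {edges[-1]:g} dB"
    return f"{edges[i - 1]:g} to {edges[i]:g} dB"


def summarize(scores, z=1.96):
    """Mean, sample standard deviation, and a normal-approximation 95% interval."""
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    half_width = z * std / math.sqrt(len(scores))
    return {"n": len(scores), "mean": mean, "std": std,
            "ci": (mean - half_width, mean + half_width)}


def stratify_by_snr(records, edges=(0.0, 10.0, 20.0)):
    """Group (snr_db, score) pairs into SNR bins and summarize each bin."""
    bins = defaultdict(list)
    for snr_db, score in records:
        bins[snr_bin(snr_db, edges)].append(score)
    return {label: summarize(scores) for label, scores in bins.items()}


# Placeholder (snr_db, score) pairs, invented purely to exercise the code.
records = [(-3, 0.2), (-1, 0.4), (5, 0.6), (15, 0.8), (16, 0.9), (25, 0.95)]
summary = stratify_by_snr(records)
```

With only a handful of captures per bin, a Student-t or bootstrap interval would be a safer choice than the normal approximation used here.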

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with no derivations or self-referential reductions

The paper introduces an SDR-based platform and benchmark metrics (PAES, PLR, hallucination count) to compare VLM outputs on synthetic spectrograms versus live over-the-air captures. No equations, derivations, fitted parameters, or first-principles results are claimed. Generalization statements are presented as direct empirical observations with explicit caveats for out-of-distribution cases (low SNR), not as predictions derived from the training distribution by construction. No self-citations, ansatzes, or uniqueness theorems are invoked to support core claims. The work is self-contained as an experimental comparison.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the unverified assumption that synthetic spectrograms capture the essential statistical structure of real RF signals for the attributes tested, plus the validity of the new metrics as proxies for 'understanding'.

axioms (1)
  • domain assumption: Synthetic data distribution sufficiently covers real RF variations for physical attribute extraction
    Invoked when claiming generalization from synthetic training to live over-the-air signals.
invented entities (1)
  • RF-Analyzer platform (no independent evidence)
    purpose: Integrates live SDR captures with VLM-based interpretation for direct evaluation
    New system introduced to enable the reported experiments.

pith-pipeline@v0.9.0 · 5598 in / 1373 out tokens · 57430 ms · 2026-05-08T16:46:56.717793+00:00 · methodology
