pith. machine review for the scientific record.

arxiv: 2604.22816 · v1 · submitted 2026-04-14 · 📡 eess.SP · cs.AI · cs.LG

Recognition: unknown

Applied AI-Enhanced RF Interference Rejection

Pith reviewed 2026-05-10 14:24 UTC · model grok-4.3

classification 📡 eess.SP · cs.AI · cs.LG
keywords RF interference rejection · transformer decoder · OFDM interference · FM radio · PESQ · AI signal processing · tactical communications

The pith

Autoregressive transformer decoders suppress OFDM interference in analog FM signals to restore intelligibility with low latency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that deep learning models trained on both clean signals of interest and their interfered mixtures can reject RF interference without needing design details about the interferer or propagation. Autoregressive transformer decoder architectures achieve this for an FM walkie-talkie signal mixed with OFDM interference, turning unintelligible audio into intelligible speech according to PESQ scores. These models deliver orders-of-magnitude faster inference than earlier WaveNet approaches while running on lightweight GPUs such as the Jetson AGX Orin to keep total latency low. The approach targets tactical environments where such interferers are common. A sympathetic reader would care because it offers a path to maintain communications in crowded spectrum without exhaustive modeling of every possible interferer.

Core claim

Autoregressive Transformer Decoder models trained on signal-of-interest and signal-mixture pairs mitigate OFDM interference in analog FM transmissions, producing intelligible output where prior methods fail, while providing orders-of-magnitude faster inference throughput than WaveNet models and maintaining low latency on edge GPUs.

What carries the argument

Autoregressive Transformer Decoder model trained on SOI-plus-interference pairs to reconstruct the original signal from the mixture without explicit interferer knowledge.
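
A minimal sketch of how one such training pair might be assembled, assuming complex baseband samples and a target signal-to-interference ratio (thermal noise is left out, so this is SIR rather than the SINR the abstract refers to). The function names `mean_power` and `mix_at_sir` are hypothetical, and the toy tone and Gaussian sequence are stand-ins for the FM and OFDM waveforms, not the paper's data pipeline.

```python
import cmath
import math
import random


def mean_power(x):
    """Average power of a complex baseband sequence."""
    return sum(abs(s) ** 2 for s in x) / len(x)


def mix_at_sir(soi, interference, sir_db):
    """Scale the interference so that SOI power / interference power
    equals sir_db, then add it to the SOI sample by sample."""
    target_ratio = 10 ** (sir_db / 10)
    gain = math.sqrt(mean_power(soi) / (target_ratio * mean_power(interference)))
    scaled = [gain * i for i in interference]
    mixture = [s + i for s, i in zip(soi, scaled)]
    return mixture, scaled


# Toy stand-ins (assumptions): a constant-envelope tone for the SOI and
# complex Gaussian samples for the interferer. Neither is real FM or OFDM.
rng = random.Random(0)
n = 1024
soi = [cmath.exp(2j * math.pi * 0.05 * k) for k in range(n)]
interference = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

mixture, scaled = mix_at_sir(soi, interference, sir_db=-10.0)
achieved_db = 10 * math.log10(mean_power(soi) / mean_power(scaled))

# One supervised example: the model sees the mixture and is asked to
# reconstruct the clean SOI.
training_pair = (mixture, soi)
print(f"achieved SIR: {achieved_db:.2f} dB")
```

Scaling the interferer rather than the SOI keeps the reconstruction target at a fixed amplitude across SIR levels, which is one reasonable design choice among several.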

Load-bearing premise

Models trained on specific signal-of-interest and interference pairs will generalize to real-world propagation conditions and interferer variations without requiring detailed design-level knowledge of the interfering signal.

What would settle it

A side-by-side PESQ comparison between the trained model output and ground-truth clean FM audio on live over-the-air captures with actual OFDM interferers under uncontrolled outdoor propagation.
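
The comparison described above could be organized along the following lines. This is a sketch under stated assumptions: `side_by_side` is a hypothetical helper, the data are synthetic placeholders rather than over-the-air captures, and `snr_db` is a simple stand-in score that is not PESQ. A PESQ implementation, such as the Python wrapper cited in the reference list, would be passed in as `score_fn` instead.

```python
import math


def snr_db(reference, estimate):
    """Stand-in quality score (NOT PESQ): reference power over error power, in dB."""
    signal = sum(r * r for r in reference)
    err = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10(signal / err)


def side_by_side(captures, score_fn):
    """Score untreated and model-processed audio against the clean reference."""
    rows = []
    for clean, interfered, restored in captures:
        before = score_fn(clean, interfered)
        after = score_fn(clean, restored)
        rows.append({"before": before, "after": after, "gain": after - before})
    return rows


# Synthetic placeholder capture (assumption): the "restored" audio carries
# one tenth of the error amplitude present in the "interfered" audio.
clean = [math.sin(2 * math.pi * 0.01 * k) for k in range(500)]
error = [0.5 * math.sin(2 * math.pi * 0.13 * k) for k in range(500)]
interfered = [c + e for c, e in zip(clean, error)]
restored = [c + 0.1 * e for c, e in zip(clean, error)]

rows = side_by_side([(clean, interfered, restored)], snr_db)
mean_gain = sum(r["gain"] for r in rows) / len(rows)
print(f"mean score gain: {mean_gain:.2f}")
```

Reporting the before and after scores per capture, rather than only the mean, makes it visible whether the model helps uniformly or only on some captures.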

Figures

Figures reproduced from arXiv: 2604.22816 by Alexia Schulz, Joey Botero, Pierre Trepagnier, Rahul Jain, Rick Gentile.

Figure 1. Adapted from [1], [5]. Example of a co-channel interference scenario
Figure 3. RF Transformer (left) & RF Transformer Decoder (right), the …
Figure 4. Data preparation process for the training pipeline. The interference is sampled at …
Figure 6. Audio intelligibility evaluation for analog radio experiment. Metrics …
Figure 7. The impact of batching on inference time. As the batch size grows, the …
Figure 8. A conceptual setup for the real-time streaming application. The SOI and OFDM interference can be mixed either over the wire via the RF combiner …
Original abstract

AI-enhanced interference rejection in radio frequency (RF) transmissions has recently attracted interest because deep learning approaches trained on both the signal of interest (SOI) and the signal mixture (SOI plus interference) can outperform traditional approaches which only consider the SOI. The goal is to detect, demodulate, and decode signals over a range of signal-to-interference-plus-noise (SINR) levels without having a detailed, design-level knowledge of the interfering signal or the propagation conditions. Our present AI interference suppression results are based on Autoregressive Transformer Decoder models which exhibit orders of magnitude faster throughput at inference time than WaveNet models developed in earlier work. As a specific example, we investigate an analog FM "Walkie Talkie" radio signal of interest in the presence of an Orthogonal Frequency-Division Multiplexing (OFDM) interferer. This type of interferer is near-ubiquitous in the current RF landscape. Our results clearly show the benefits of transformer-based interference mitigation in tactical settings. We show that unintelligible transmissions become intelligible via metrics such as Perceptual Evaluation of Speech Quality (PESQ), while overall latency is kept to a minimum using readily available lightweight GPUs such as a Jetson AGX Orin. We believe these same techniques can also be applied to a broader set of national security scenarios, as well as having commercial applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript describes the application of autoregressive transformer decoder models to suppress OFDM interference in analog FM radio signals of interest. It highlights that these models achieve orders of magnitude faster inference throughput compared to earlier WaveNet models, restore intelligibility to previously unintelligible transmissions as quantified by the Perceptual Evaluation of Speech Quality (PESQ) metric, and support low-latency operation on embedded GPUs such as the Jetson AGX Orin. The approach aims to work without detailed design-level knowledge of the interferer or propagation conditions, with potential extensions to broader national security and commercial scenarios.

Significance. If the performance gains and generalization hold under rigorous testing, this work could significantly advance practical deployment of deep learning for real-time RF interference mitigation in tactical environments. The shift to transformer decoders for improved speed addresses a key limitation of prior generative models, potentially enabling on-device processing in resource-constrained settings.

major comments (2)
  1. [Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; the central performance claims cannot be evaluated from the available text.
  2. [Abstract] No evidence is provided of testing on altered interferer parameters (subcarrier spacing, bandwidth, power, multipath) or on different propagation conditions. Such testing is required to support the generalization assumption underlying the central claim.
minor comments (1)
  1. [Abstract] Consider adding a reference to the earlier WaveNet work for context on the speed improvement.
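
The generalization test requested in major comment 2 could be laid out as a grid of interferer conditions, with the training condition separated from the held-out ones. The sketch below is only illustrative: every parameter value, the dictionary keys, and the helper names `build_conditions` and `split_by_training` are assumptions, not settings or code from the paper.

```python
from itertools import product

# Illustrative values only (assumptions); the paper's settings are not given here.
SUBCARRIER_SPACING_KHZ = [15, 30, 60]
BANDWIDTH_MHZ = [5, 10, 20]
SIR_DB = [-20, -10, 0, 10]
MULTIPATH = ["none", "two_ray"]


def build_conditions():
    """Enumerate every combination of the interferer parameters above."""
    return [
        {"scs_khz": scs, "bw_mhz": bw, "sir_db": sir, "multipath": mp}
        for scs, bw, sir, mp in product(
            SUBCARRIER_SPACING_KHZ, BANDWIDTH_MHZ, SIR_DB, MULTIPATH
        )
    ]


def split_by_training(conditions, trained_on):
    """Separate the condition seen in training from the held-out ones."""
    seen = [c for c in conditions if c == trained_on]
    held_out = [c for c in conditions if c != trained_on]
    return seen, held_out


conditions = build_conditions()
# Hypothetical single training condition.
trained_on = {"scs_khz": 15, "bw_mhz": 10, "sir_db": -10, "multipath": "none"}
seen, held_out = split_by_training(conditions, trained_on)
print(len(conditions), len(seen), len(held_out))
```

Scores on the held-out conditions, reported next to the score on the training condition, would show directly how far performance degrades away from the trained case.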

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's comments. We have carefully considered the feedback and will make revisions to the manuscript, particularly to the abstract, to address the concerns raised.

  1. Referee: [Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; central performance claims cannot be evaluated from available text.

    Authors: We acknowledge that the abstract is written at a high level to provide an overview. The full paper includes quantitative results such as PESQ metric improvements, inference throughput comparisons (orders of magnitude faster than WaveNet), training dataset details for the FM SOI and OFDM interferer mixtures, model baselines, and analysis. To make the abstract self-contained for evaluation, we will revise it to incorporate key quantitative highlights, including specific performance metrics and a brief mention of the experimental setup.
    revision: yes

  2. Referee: [Abstract] No evidence is provided of testing on altered interferer parameters such as subcarrier spacing, bandwidth, power, or multipath, or different propagation conditions, which is required to support the generalization assumption underlying the central claim.

    Authors: The presented work focuses on demonstrating the effectiveness of autoregressive transformer decoders for a specific but representative case of OFDM interference in FM radio signals. The model is designed to operate without requiring detailed knowledge of the interferer or propagation conditions. However, we agree that explicit testing across varied parameters would strengthen the generalization claims. We will revise the abstract and add a discussion in the manuscript to clarify the scope of the current experiments and identify testing on altered interferer parameters and propagation conditions as important future work.
    revision: yes

Circularity Check

0 steps flagged

No derivation chain present; purely empirical results

The paper reports experimental outcomes from training autoregressive transformer decoder models on specific SOI-plus-interference mixtures and evaluating intelligibility restoration via PESQ metrics. No equations, derivations, fitted parameters renamed as predictions, or self-referential definitions appear in the abstract or described results. The central claims rest on measured throughput and quality improvements rather than any mathematical reduction to inputs or self-citation load-bearing premises. Minor reference to prior WaveNet work is not used to justify any uniqueness theorem or ansatz. The work is self-contained as an empirical demonstration.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical axioms, free parameters, or invented entities are introduced or required in the abstract; the work is an empirical application of existing transformer architectures to a signal-processing task.

pith-pipeline@v0.9.0 · 5550 in / 1109 out tokens · 26906 ms · 2026-05-10T14:24:25.224972+00:00 · methodology

Reference graph

Works this paper leans on

22 extracted references · 6 canonical work pages · 1 internal anchor

  1. [1]

    RF challenge: The data-driven radio frequency signal separation challenge,

    A. Lancho, A. Weiss, G. C. F. Lee, T. Jayashankar, B. G. Kurien, Y. Polyanskiy, and G. W. Wornell, “RF challenge: The data-driven radio frequency signal separation challenge,” IEEE Open Journal of the Communications Society, vol. 6, pp. 4083–4100, 2025

  2. [2]

    Score estimation for generative modeling,

    T. K. Jayashankar, “Score estimation for generative modeling,” Doctoral Thesis, Massachusetts Institute of Technology, May

  3. [3]

    [Online]. Available: https://sia.mit.edu/wp-content/uploads/2025/08/2025-jayashankar-phd.pdf

  4. [4]

    The radio-frequency transformer for signal separation,

    E. Lifar, S. Savkin, R. Madhukara, T. Jayashankar, Y. Polyanskiy, and G. W. Wornell, “The radio-frequency transformer for signal separation,”

  5. [5]

    [Online]. Available: https://arxiv.org/abs/2603.09201

  6. [6]

    Case Study B: AI Agents for the Tactical Edge,

    P. Trepagnier and A. Wollaber, Case Study B: AI Agents for the Tactical Edge. Cham: Springer International Publishing, 2023, pp. 409–424. [Online]. Available: https://doi.org/10.1007/978-3-031-29269-9_20

  7. [7]

    Advancing AI challenges for the United States Department of the Air Force*,

    C. Prothmann, V. Gadepally, J. Kepner et al., “Advancing AI challenges for the United States Department of the Air Force*,” in 2025 IEEE High Performance Extreme Computing Conference (HPEC), 2025, pp. 1–8

  8. [8]

    J. G. Proakis and M. Salehi, Digital Communications, 5th ed. New York, NY: McGraw-Hill Education, 2008

  9. [9]

    S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory. Upper Saddle River, NJ: Prentice Hall PTR, 1993

  10. [10]

    Multi-user detection for DS-CDMA communications,

    S. Moshavi, “Multi-user detection for DS-CDMA communications,” IEEE Personal Communications Magazine, vol. 3, no. 5, pp. 27–36, October 1996

  11. [11]

    WaveNet: A Generative Model for Raw Audio

    A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu, “WaveNet: A generative model for raw audio,” in arXiv, 2016. [Online]. Available: https://arxiv.org/abs/1609.03499

  12. [12]

    Attention is all you need,

    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proceedings of the 31st International Conference on Neural Information Processing Systems, ser. NIPS’17. Red Hook, NY, USA: Curran Associates Inc., 2017, p. 6000–6010

  13. [13]

    Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs,

    A. Rix, J. Beerends, M. Hollier, and A. Hekstra, “Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs,” in 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221), vol. 2, 2001, pp. 749–752

  14. [14]

    Pesq (perceptual evaluation of speech quality) wrapper for python users,

    R. G. D. Miao Wang, Christoph Boeddeker and ananda seelan, “Pesq (perceptual evaluation of speech quality) wrapper for python users,” May 2022. [Online]. Available: https://doi.org/10.5281/zenodo.6549559

  15. [15]

    mir_eval: A transparent implementation of common MIR metrics,

    C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang, and D. P. W. Ellis, “mir_eval: A transparent implementation of common MIR metrics,” in Proceedings of the 15th International Conference on Music Information Retrieval (ISMIR). Taipei, Taiwan: International Society for Music Information Retrieval, November 2014, pp. 367–372.

  16. [16]

    librosa/librosa: 0.11.0,

    B. McFee, M. McVicar, D. Faronbi et al., “librosa/librosa: 0.11.0,” Mar

  17. [17]

    [Online]. Available: https://doi.org/10.5281/zenodo.15006942

  18. [18]

    mel-cepstral-distance,

    S. Taubert and J. Sternkopf, “mel-cepstral-distance,” Apr. 2025. [Online]. Available: https://github.com/stefantaubert/mel-cepstral-distance

  19. [19]

    pystoi: Python implementation of the short-time objective intelligibility (stoi) metric,

    M. Pariente and contributors, “pystoi: Python implementation of the short-time objective intelligibility (stoi) metric,” 2020. [Online]. Available: https://github.com/mpariente/pystoi

  20. [20]

    An algorithm for predicting the intelligibility of speech masked by modulated noise maskers,

    J. Jensen and C. H. Taal, “An algorithm for predicting the intelligibility of speech masked by modulated noise maskers,” IEEE/ACM Trans. Audio, Speech and Lang. Proc., vol. 24, no. 11, p. 2009–2022, Nov. 2016. [Online]. Available: https://doi.org/10.1109/TASLP.2016.2585878

  21. [21]

    ITU-T Recommendation P.863: Perceptual Objective Listening Quality Analysis (POLQA),

    International Telecommunication Union, “ITU-T Recommendation P.863: Perceptual Objective Listening Quality Analysis (POLQA),” International Telecommunication Union, Tech. Rep., 2011. [Online]. Available: https://www.itu.int/rec/T-REC-P.863

  22. [22]

    Interactive supercomputing on 40,000 cores for machine learning and data analysis,

    A. Reuther, J. Kepner, C. Byun, S. Samsi, W. Arcand, D. Bestor, B. Bergeron, V. Gadepally, M. Houle, M. Hubbell, M. Jones, A. Klein, L. Milechin, J. Mullen, A. Prout, A. Rosa, C. Yee, and P. Michaleas, “Interactive supercomputing on 40,000 cores for machine learning and data analysis,” in 2018 IEEE High Performance extreme Computing Conference (HPEC), 201...