arxiv: 2604.22816 · v1 · submitted 2026-04-14 · 📡 eess.SP · cs.AI · cs.LG


Applied AI-Enhanced RF Interference Rejection

Rahul Jain , Pierre Trepagnier , Rick Gentile , Joey Botero , Alexia Schulz


Pith reviewed 2026-05-10 14:24 UTC · model grok-4.3

classification 📡 eess.SP · cs.AI · cs.LG

keywords RF interference rejection · transformer decoder · OFDM interference · FM radio · PESQ · AI signal processing · tactical communications


The pith

Autoregressive transformer decoders suppress OFDM interference in analog FM signals to restore intelligibility with low latency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that deep learning models trained on both clean signals of interest and their interfered mixtures can reject RF interference without needing design details about the interferer or propagation. Autoregressive transformer decoder architectures achieve this for an FM walkie-talkie signal mixed with OFDM interference, turning unintelligible audio into intelligible speech according to PESQ scores. These models deliver orders-of-magnitude faster inference than earlier WaveNet approaches while running on lightweight GPUs such as the Jetson AGX Orin to keep total latency low. The approach targets tactical environments where such interferers are common. A sympathetic reader would care because it offers a path to maintain communications in crowded spectrum without exhaustive modeling of every possible interferer.

Core claim

Autoregressive Transformer Decoder models trained on signal-of-interest and signal-mixture pairs mitigate OFDM interference in analog FM transmissions, producing intelligible output where prior methods fail, while providing orders-of-magnitude faster inference throughput than WaveNet models and maintaining low latency on edge GPUs.
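To make "autoregressive" concrete, here is a minimal sketch of the generation loop, in which every output sample depends on the outputs already emitted. The names `generate` and `toy_predictor` are hypothetical, the predictor is a trivial placeholder rather than a transformer decoder, and limiting each step to the mixture samples seen so far is an assumption made for a streaming setting, not a detail confirmed by the paper.

```python
def generate(mixture, predict_next):
    """Produce one output sample per input sample, left to right.

    predict_next(mixture_so_far, outputs_so_far) -> next output sample.
    Each step sees only the mixture up to the current sample and the
    outputs already emitted, which is what makes the loop autoregressive.
    """
    outputs = []
    for t in range(len(mixture)):
        outputs.append(predict_next(mixture[: t + 1], outputs))
    return outputs


def toy_predictor(mixture_so_far, outputs_so_far):
    """Placeholder for a trained network (hypothetical): blend the newest
    input sample with the previous output sample."""
    previous = outputs_so_far[-1] if outputs_so_far else 0.0
    return 0.5 * mixture_so_far[-1] + 0.5 * previous


restored = generate([1.0, 1.0, 1.0], toy_predictor)
print(restored)  # [0.5, 0.75, 0.875]
```

Because each step waits for the previous one, the per-step cost of the predictor dominates latency, which is why inference throughput is a central concern for this class of model.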

What carries the argument

Autoregressive Transformer Decoder model trained on SOI-plus-interference pairs to reconstruct the original signal from the mixture without explicit interferer knowledge.
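One way such a training pair could be assembled is sketched below, assuming complex baseband samples and a target signal-to-interference ratio (SIR) in dB. The helper `mix_at_sir`, the toy signals, and the use of plain Python instead of an array library are illustrative assumptions, not the paper's data pipeline.

```python
import cmath
import math
import random


def mean_power(x):
    """Average power of a sequence of complex samples."""
    return sum(abs(s) ** 2 for s in x) / len(x)


def mix_at_sir(soi, interference, sir_db):
    """Scale the interference so that SOI power / interference power
    equals the requested SIR (in dB), then add it to the SOI.

    Returns (mixture, target): the model input and the clean SOI the
    model should reconstruct. Hypothetical helper, not from the paper.
    """
    if len(soi) != len(interference):
        raise ValueError("signals must have equal length")
    p_soi = mean_power(soi)
    p_int = mean_power(interference)
    if p_int == 0:
        raise ValueError("interference has zero power")
    # Amplitude gain that brings the interference to the desired power.
    gain = math.sqrt(p_soi / (p_int * 10 ** (sir_db / 10)))
    mixture = [s + gain * i for s, i in zip(soi, interference)]
    return mixture, list(soi)


# Toy stand-ins: a constant-envelope tone for the FM-like SOI and
# complex Gaussian samples for the OFDM-like interferer.
rng = random.Random(0)
n = 4096
soi = [cmath.exp(2j * math.pi * 0.01 * k) for k in range(n)]
interferer = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

mixture, target = mix_at_sir(soi, interferer, sir_db=-10.0)

# Check the SIR actually achieved in the mixture.
residual = [m - t for m, t in zip(mixture, target)]
achieved_sir_db = 10 * math.log10(mean_power(target) / mean_power(residual))
```

Sweeping `sir_db` over a range of values would yield pairs at different interference strengths from the same two recordings.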

Load-bearing premise

Models trained on specific signal-of-interest and interference pairs will generalize to real-world propagation conditions and interferer variations without requiring detailed design-level knowledge of the interfering signal.

What would settle it

A side-by-side PESQ comparison between the trained model output and ground-truth clean FM audio on live over-the-air captures with actual OFDM interferers under uncontrolled outdoor propagation.
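The shape of such a side-by-side comparison can be sketched as a small harness that scores the raw mixture and the model output against the same clean reference. The function names and toy signals are hypothetical, and the scoring function used here is a plain signal-to-error ratio standing in for PESQ; a real evaluation would pass a PESQ implementation as `score` instead.

```python
import math


def snr_db(reference, estimate):
    """Stand-in score: signal-to-error ratio in dB. This is NOT PESQ."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error == 0:
        return math.inf
    return 10 * math.log10(signal / error)


def side_by_side(captures, score):
    """Score the raw mixture and the model output against the same clean
    reference for every capture, and report the improvement."""
    rows = []
    for name, clean, raw, cleaned in captures:
        before = score(clean, raw)
        after = score(clean, cleaned)
        rows.append({"capture": name, "before": before,
                     "after": after, "gain": after - before})
    return rows


# Toy data: the "model output" keeps one tenth of the interference amplitude.
n = 1000
clean = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noise = [math.cos(2 * math.pi * 97 * k / n) for k in range(n)]
raw = [c + x for c, x in zip(clean, noise)]
cleaned = [c + 0.1 * x for c, x in zip(clean, noise)]

report = side_by_side([("toy-capture", clean, raw, cleaned)], snr_db)
```

Keeping the metric pluggable lets the same harness report several intelligibility scores for the same captures.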

Figures

Figures reproduced from arXiv: 2604.22816 by Alexia Schulz, Joey Botero, Pierre Trepagnier, Rahul Jain, Rick Gentile.

**Figure 3.** RF Transformer (left) & RF Transformer Decoder (right), the …

**Figure 4.** Data preparation process for the training pipeline. The interference is sampled at …

**Figure 6.** Audio intelligibility evaluation for analog radio experiment. Metrics …

**Figure 7.** The impact of batching on inference time. As the batch size grows, the …

**Figure 8.** A conceptual setup for the real-time streaming application. The SOI and OFDM interference can be mixed either over the wire via the RF combiner …

Original abstract

AI-enhanced interference rejection in radio frequency (RF) transmissions has recently attracted interest because deep learning approaches trained on both the signal of interest (SOI) and the signal mixture (SOI plus interference) can outperform traditional approaches which only consider the SOI. The goal is to detect, demodulate, and decode signals over a range of signal-to-interference-plus-noise (SINR) levels without having a detailed, design-level knowledge of the interfering signal or the propagation conditions. Our present AI interference suppression results are based on Autoregressive Transformer Decoder models which exhibit orders of magnitude faster throughput at inference time than WaveNet models developed in earlier work. As a specific example, we investigate an analog FM "Walkie Talkie" radio signal of interest in the presence of an Orthogonal Frequency-Division Multiplexing (OFDM) interferer. This type of interferer is near-ubiquitous in the current RF landscape. Our results clearly show the benefits of transformer-based interference mitigation in tactical settings. We show that unintelligible transmissions become intelligible via metrics such as Perceptual Evaluation of Speech Quality (PESQ), while overall latency is kept to a minimum using readily available lightweight GPUs such as a Jetson AGX Orin. We believe these same techniques can also be applied to a broader set of national security scenarios, as well as having commercial applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper adapts autoregressive transformer decoders to suppress OFDM interference in FM signals with a speed advantage over WaveNet, but the abstract supplies no numbers or details to support the claims.


The paper's main contribution is showing that autoregressive transformer decoder models can be used for suppressing OFDM interference in analog FM transmissions, with the advantage of much quicker inference compared to the WaveNet models from previous studies. They demonstrate this on a walkie-talkie style signal and claim that the output becomes intelligible again according to PESQ scores, all while running efficiently on accessible hardware like the Jetson AGX Orin.

What stands out positively is the practical orientation. The authors target a scenario common in both tactical and commercial RF environments, where the interferer is OFDM-based and you cannot assume knowledge of its exact parameters. By focusing on end-to-end learning from mixtures, they align with the goal of handling unknown propagation and interference without custom design. The emphasis on low latency for real-time use is a good fit for the application.

On the downside, the abstract provides almost no hard evidence. It mentions qualitative benefits and a speed comparison but gives no actual figures for PESQ improvement, no inference time measurements, no information on the datasets used, training procedures, or baseline comparisons beyond the high-level WaveNet reference. This makes it difficult to assess whether the claims hold. The concern about generalization is valid here too; models trained on fixed signal pairs may not transfer well to different interferer settings or channel conditions without additional validation.

Overall, this is the kind of applied paper that could interest engineers working on robust communications systems or ML practitioners looking for domain-specific adaptations of transformer architectures. It does not appear to introduce new theory or broad methods, but the specific combination for this RF problem could be worth exploring if the experiments are solid.

I recommend engaging with it through peer review, provided the full paper includes the detailed results and some checks on robustness to variations in the interference.

Referee Report

2 major / 1 minor

Summary. The manuscript describes the application of autoregressive transformer decoder models to suppress OFDM interference in analog FM radio signals of interest. It highlights that these models achieve orders of magnitude faster inference throughput compared to earlier WaveNet models, restore intelligibility to previously unintelligible transmissions as quantified by the Perceptual Evaluation of Speech Quality (PESQ) metric, and support low-latency operation on embedded GPUs such as the Jetson AGX Orin. The approach aims to work without detailed design-level knowledge of the interferer or propagation conditions, with potential extensions to broader national security and commercial scenarios.

Significance. If the performance gains and generalization hold under rigorous testing, this work could significantly advance practical deployment of deep learning for real-time RF interference mitigation in tactical environments. The shift to transformer decoders for improved speed addresses a key limitation of prior generative models, potentially enabling on-device processing in resource-constrained settings.

major comments (2)

[Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; central performance claims cannot be evaluated from available text.

[Abstract] No evidence is provided of testing on altered interferer parameters such as subcarrier spacing, bandwidth, power, or multipath, or different propagation conditions, which is required to support the generalization assumption underlying the central claim.
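The kind of robustness sweep requested in the second comment could be enumerated as below. This is a sketch only: the parameter names and values are illustrative placeholders, not settings taken from the paper, and `iter_conditions` is a hypothetical helper.

```python
import itertools

# Hypothetical sweep values, chosen only to illustrate the grid.
sweep = {
    "subcarrier_spacing_khz": [15, 30, 60],
    "bandwidth_mhz": [5, 10, 20],
    "sir_db": [-20, -10, 0, 10],
    "multipath_taps": [1, 3, 5],
}


def iter_conditions(grid):
    """Yield one dict per combination of parameter values in the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


conditions = list(iter_conditions(sweep))
print(len(conditions))  # 108
```

Evaluating the trained model once per condition, with the training configuration held fixed, would show how far performance degrades away from the conditions seen in training.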

minor comments (1)

[Abstract] Consider adding a reference to the earlier WaveNet work for context on the speed improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the opportunity to respond to the referee's comments. We have carefully considered the feedback and will make revisions to the manuscript, particularly to the abstract, to address the concerns raised.


Referee: [Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; central performance claims cannot be evaluated from available text.

Authors: We acknowledge that the abstract is written at a high level to provide an overview. The full paper includes quantitative results such as PESQ metric improvements, inference throughput comparisons (orders of magnitude faster than WaveNet), training dataset details for the FM SOI and OFDM interferer mixtures, model baselines, and analysis. To make the abstract self-contained for evaluation, we will revise it to incorporate key quantitative highlights, including specific performance metrics and a brief mention of the experimental setup.

revision: yes

Referee: [Abstract] No evidence is provided of testing on altered interferer parameters such as subcarrier spacing, bandwidth, power, or multipath, or different propagation conditions, which is required to support the generalization assumption underlying the central claim.

Authors: The presented work focuses on demonstrating the effectiveness of autoregressive transformer decoders for a specific but representative case of OFDM interference in FM radio signals. The model is designed to operate without requiring detailed knowledge of the interferer or propagation conditions. However, we agree that explicit testing across varied parameters would strengthen the generalization claims. We will revise the abstract and add a discussion in the manuscript to clarify the scope of the current experiments and identify testing on altered interferer parameters and propagation conditions as important future work.

revision: yes

Circularity Check

0 steps flagged

No derivation chain present; purely empirical results


The paper reports experimental outcomes from training autoregressive transformer decoder models on specific SOI-plus-interference mixtures and evaluating intelligibility restoration via PESQ metrics. No equations, derivations, fitted parameters renamed as predictions, or self-referential definitions appear in the abstract or described results. The central claims rest on measured throughput and quality improvements rather than on any mathematical reduction to inputs or on load-bearing self-citations. The minor reference to prior WaveNet work is not used to justify any uniqueness theorem or ansatz. The work is self-contained as an empirical demonstration.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical axioms, free parameters, or invented entities are introduced or required in the abstract; the work is an empirical application of existing transformer architectures to a signal-processing task.

pith-pipeline@v0.9.0 · 5550 in / 1109 out tokens · 26906 ms · 2026-05-10T14:24:25.224972+00:00 · methodology

