Recognition: unknown
Applied AI-Enhanced RF Interference Rejection
Pith reviewed 2026-05-10 14:24 UTC · model grok-4.3
The pith
Autoregressive transformer decoders suppress OFDM interference in analog FM signals to restore intelligibility with low latency.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Autoregressive Transformer decoder models trained on paired signal-of-interest (SOI) and signal-mixture data mitigate OFDM interference in analog FM transmissions. Previously unintelligible transmissions become intelligible as measured by PESQ, inference throughput is orders of magnitude higher than with earlier WaveNet models, and latency stays low on lightweight edge GPUs such as the Jetson AGX Orin.
What carries the argument
An autoregressive Transformer decoder model, trained on paired SOI and SOI-plus-interference data, that reconstructs the original signal from the mixture without explicit knowledge of the interferer.
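The property that makes a decoder "autoregressive" is the causal mask: each output position may attend only to the current and earlier positions. The sketch below illustrates that one idea in plain Python. It is not the paper's architecture: it uses a single head, and it assumes identity projections (queries, keys, and values are the inputs themselves), so no learned weights are involved. The function name `causal_self_attention` is hypothetical.

```python
import math


def causal_self_attention(x):
    """Single-head self-attention with a causal mask over a list of vectors.

    Simplifying assumption: queries, keys, and values are the inputs
    themselves (identity projections), so there are no learned weights.
    """
    dim = len(x[0])
    out = []
    for t in range(len(x)):
        # Causal mask: position t is scored against positions 0..t only.
        scores = [
            sum(a * b for a, b in zip(x[t], x[s])) / math.sqrt(dim)
            for s in range(t + 1)
        ]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the visible value vectors.
        out.append([
            sum(w * x[s][d] for s, w in enumerate(weights)) / total
            for d in range(dim)
        ])
    return out


seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
full = causal_self_attention(seq)
prefix = causal_self_attention(seq[:2])
```

Because of the mask, the first two outputs of `full` match `prefix`: appending a later sample never changes earlier outputs, which is what allows sample-by-sample generation at inference time.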
Load-bearing premise
Models trained on specific signal-of-interest and interference pairs will generalize to real-world propagation conditions and interferer variations without requiring detailed design-level knowledge of the interfering signal.
What would settle it
A side-by-side PESQ comparison between the trained model output and ground-truth clean FM audio on live over-the-air captures with actual OFDM interferers under uncontrolled outdoor propagation.
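The comparison described above can be organized as a paired before/after harness. The sketch below is a minimal illustration, not the paper's evaluation code. `snr_db` is a stand-in score used only so the example runs with the standard library; a real PESQ implementation would be passed as `metric` instead. The function names and the toy signals are assumptions.

```python
import math


def snr_db(reference, estimate):
    """Stand-in score: signal-to-error ratio in dB. This is NOT PESQ."""
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return float("inf") if error == 0 else 10.0 * math.log10(signal / error)


def paired_comparison(clips, metric):
    """Score each clip before and after mitigation against the same reference."""
    rows = []
    for reference, received, restored in clips:
        before = metric(reference, received)
        after = metric(reference, restored)
        rows.append({"before": before, "after": after, "gain": after - before})
    mean_gain = sum(r["gain"] for r in rows) / len(rows)
    return rows, mean_gain


# Toy clip (hypothetical): a tone, a heavily corrupted copy, a lightly corrupted copy.
reference = [math.sin(0.1 * n) for n in range(200)]
received = [r + 0.5 * math.cos(1.3 * n) for n, r in enumerate(reference)]
restored = [r + 0.05 * math.cos(1.3 * n) for n, r in enumerate(reference)]
rows, mean_gain = paired_comparison([(reference, received, restored)], snr_db)
```

Scoring the received and restored signals against the same clean reference keeps the comparison paired, so per-clip gains can be inspected rather than only an aggregate.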
Original abstract
AI-enhanced interference rejection in radio frequency (RF) transmissions has recently attracted interest because deep learning approaches trained on both the signal of interest (SOI) and the signal mixture (SOI plus interference) can outperform traditional approaches that consider only the SOI. The goal is to detect, demodulate, and decode signals over a range of signal-to-interference-plus-noise ratio (SINR) levels without detailed, design-level knowledge of the interfering signal or the propagation conditions. Our present AI interference suppression results are based on Autoregressive Transformer Decoder models which exhibit orders of magnitude faster throughput at inference time than WaveNet models developed in earlier work. As a specific example, we investigate an analog FM "Walkie Talkie" radio signal of interest in the presence of an Orthogonal Frequency-Division Multiplexing (OFDM) interferer. This type of interferer is near-ubiquitous in the current RF landscape. Our results clearly show the benefits of transformer-based interference mitigation in tactical settings. We show that unintelligible transmissions become intelligible via metrics such as Perceptual Evaluation of Speech Quality (PESQ), while overall latency is kept to a minimum using readily available lightweight GPUs such as a Jetson AGX Orin. We believe these same techniques can also be applied to a broader set of national security scenarios, as well as having commercial applications.
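To make the training setup in the abstract concrete, the sketch below builds one (mixture, SOI) pair: a complex-baseband FM signal plus an OFDM interferer scaled to a target signal-to-interference ratio. It is a toy illustration under stated assumptions, not the paper's data pipeline. The sample rate, tone frequency, deviation, subcarrier count, and cyclic-prefix length are invented for the example, and thermal noise is omitted, so the ratio set here is SIR rather than SINR. All function names are hypothetical.

```python
import cmath
import math
import random


def fm_baseband(message, deviation_hz, sample_rate):
    """Complex-baseband FM: the phase is the running integral of the message."""
    phase, out = 0.0, []
    for m in message:
        phase += 2.0 * math.pi * deviation_hz * m / sample_rate
        out.append(cmath.exp(1j * phase))
    return out


def ofdm_symbol(num_subcarriers, cp_len, rng):
    """One OFDM symbol: random QPSK per subcarrier, naive IDFT, cyclic prefix."""
    n = num_subcarriers
    qpsk = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
            for _ in range(n)]
    time = [sum(qpsk[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]
    return time[-cp_len:] + time  # prepend the last cp_len samples


def power(sig):
    """Mean power of a complex sequence."""
    return sum(abs(v) ** 2 for v in sig) / len(sig)


def mix_at_sir(soi, interferer, sir_db):
    """Scale the interferer so that P_soi / P_interferer equals sir_db."""
    gain = math.sqrt(power(soi) / (power(interferer) * 10 ** (sir_db / 10)))
    scaled = [gain * v for v in interferer]
    return [s + i for s, i in zip(soi, scaled)], scaled


rng = random.Random(0)
fs = 8000.0  # assumed sample rate in Hz
message = [math.sin(2 * math.pi * 300.0 * n / fs) for n in range(160)]
soi = fm_baseband(message, deviation_hz=2500.0, sample_rate=fs)

interferer = []
while len(interferer) < len(soi):
    interferer += ofdm_symbol(num_subcarriers=16, cp_len=4, rng=rng)
interferer = interferer[:len(soi)]

# Training pair: the model input is `mixture`, the target is `soi`.
mixture, scaled = mix_at_sir(soi, interferer, sir_db=-10.0)
```

At -10 dB the interferer carries ten times the power of the FM signal. Sweeping `sir_db` would yield pairs across a range of interference levels.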
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript describes the application of autoregressive transformer decoder models to suppress OFDM interference in analog FM radio signals of interest. It highlights that these models achieve orders of magnitude faster inference throughput compared to earlier WaveNet models, restore intelligibility to previously unintelligible transmissions as quantified by the Perceptual Evaluation of Speech Quality (PESQ) metric, and support low-latency operation on embedded GPUs such as the Jetson AGX Orin. The approach aims to work without detailed design-level knowledge of the interferer or propagation conditions, with potential extensions to broader national security and commercial scenarios.
Significance. If the performance gains and generalization hold under rigorous testing, this work could significantly advance practical deployment of deep learning for real-time RF interference mitigation in tactical environments. The shift to transformer decoders for improved speed addresses a key limitation of prior generative models, potentially enabling on-device processing in resource-constrained settings.
major comments (2)
- [Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; central performance claims cannot be evaluated from available text.
- [Abstract] No evidence is provided of testing on altered interferer parameters such as subcarrier spacing, bandwidth, power, or multipath, or different propagation conditions, which is required to support the generalization assumption underlying the central claim.
minor comments (1)
- [Abstract] Consider adding a reference to the earlier WaveNet work for context on the speed improvement.
Simulated Author's Rebuttal
Thank you for the opportunity to respond to the referee's comments. We have carefully considered the feedback and will make revisions to the manuscript, particularly to the abstract, to address the concerns raised.
Point-by-point responses
- Referee: [Abstract] The abstract states qualitative benefits and a speed comparison but supplies no quantitative results, training details, dataset descriptions, baselines, or error analysis; central performance claims cannot be evaluated from available text.
  Authors: We acknowledge that the abstract is written at a high level to provide an overview. The full paper includes quantitative results such as PESQ metric improvements, inference throughput comparisons (orders of magnitude faster than WaveNet), training dataset details for the FM SOI and OFDM interferer mixtures, model baselines, and analysis. To make the abstract self-contained for evaluation, we will revise it to incorporate key quantitative highlights, including specific performance metrics and a brief mention of the experimental setup. Revision: yes.
- Referee: [Abstract] No evidence is provided of testing on altered interferer parameters such as subcarrier spacing, bandwidth, power, or multipath, or different propagation conditions, which is required to support the generalization assumption underlying the central claim.
  Authors: The presented work focuses on demonstrating the effectiveness of autoregressive transformer decoders for a specific but representative case of OFDM interference in FM radio signals. The model is designed to operate without requiring detailed knowledge of the interferer or propagation conditions. However, we agree that explicit testing across varied parameters would strengthen the generalization claims. We will revise the abstract and add a discussion in the manuscript to clarify the scope of the current experiments and identify testing on altered interferer parameters and propagation conditions as important future work. Revision: yes.
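The generalization test raised in this exchange amounts to evaluating over a grid of interferer and channel conditions and flagging which ones the model never saw in training. The sketch below shows one way to enumerate such a grid. Every parameter value, and the `TRAINED_ON` set, is a hypothetical placeholder; none comes from the paper.

```python
import itertools

# Hypothetical sweep values, chosen only to illustrate the grid.
SWEEP = {
    "subcarrier_spacing_khz": [15, 30, 60],
    "bandwidth_mhz": [5, 10, 20],
    "sir_db": [-20, -10, 0, 10],
    "multipath_taps": [1, 3],
}

# Hypothetical description of the conditions covered by the training set.
TRAINED_ON = {
    "subcarrier_spacing_khz": [15],
    "bandwidth_mhz": [10],
    "sir_db": [-10, 0],
    "multipath_taps": [1],
}


def sweep_conditions(grid):
    """Yield one dict per combination of parameter values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def held_out(condition, trained_on):
    """True if at least one parameter value was absent from training."""
    return any(condition[k] not in trained_on[k] for k in trained_on)


conditions = list(sweep_conditions(SWEEP))
unseen = [c for c in conditions if held_out(c, TRAINED_ON)]
```

Reporting scores separately for seen and unseen conditions would show directly whether performance holds outside the training distribution.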
Circularity Check
No derivation chain present; purely empirical results
The paper reports experimental outcomes from training autoregressive transformer decoder models on specific SOI-plus-interference mixtures and evaluating intelligibility restoration via PESQ metrics. No equations, derivations, fitted parameters renamed as predictions, or self-referential definitions appear in the abstract or described results. The central claims rest on measured throughput and quality improvements rather than on any mathematical reduction to inputs or on premises supported only by self-citation. The minor reference to prior WaveNet work is not used to justify any uniqueness theorem or ansatz. The work is self-contained as an empirical demonstration.