pith. sign in

arxiv: 2606.23210 · v1 · pith:FCOENCDGnew · submitted 2026-06-22 · 💻 cs.LG · cs.DC· eess.SP

Efficient Network Inference via Hardware-Aware Architecture Search, Model Pruning & Quantization

Pith reviewed 2026-06-26 09:19 UTC · model grok-4.3

classification 💻 cs.LG cs.DCeess.SP
keywords GNSS interference monitoringmodel pruningquantizationneural architecture searchembedded deploymentMCUNetreal-time inferencedeep neural networks
0
0 comments X

The pith

Combining structured pruning, post-training quantization, and hardware-aware NAS produces compact models suitable for real-time GNSS interference monitoring on embedded hardware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Embedded GNSS interference monitoring must handle large volumes of raw IQ samples in real time, yet requires expressive neural networks for reliable classification and characterization under varied conditions. This creates a direct conflict between model capability and the memory and speed limits of devices such as the iMXRT1062 MCU. The paper starts from the MCUNet baseline and applies iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS to shrink model size, computational cost, and memory footprint. Experiments on a GNSS interference dataset confirm that the resulting networks continue to support both classification and generalized characterization tasks. The work supplies concrete guidance for deploying such models on the listed embedded platforms.

Core claim

Starting from MCUNet as baseline, iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS jointly reduce model size, computational complexity, and memory usage while preserving performance on GNSS interference classification and generalized characterization tasks, as measured on a dedicated dataset and verified across iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5.

What carries the argument

Iterative structured pruning combined with post-training static quantization and hardware-aware zero-shot NAS applied to the MCUNet baseline.

If this is right

  • The compressed models satisfy the memory and latency constraints of the iMXRT1062 MCU and the two Raspberry Pi platforms for real-time IQ-sample processing.
  • Both classification accuracy and generalized characterization quality stay within acceptable bounds after compression.
  • The combination of the three techniques supplies a repeatable recipe for producing deployable GNSS-monitoring networks.
  • Hardware-aware NAS guides architecture choices that further improve fit to the target embedded devices.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same compression sequence could be tested on other real-time signal-classification problems that run on similar microcontrollers.
  • If hardware-specific latency measurements were collected during NAS, the search could be tightened to meet explicit timing budgets.
  • Repeating the pipeline on datasets with more diverse interference types would test whether the observed size-performance trade-off generalizes.

Load-bearing premise

Predictive performance on classification and generalized characterization tasks remains acceptable after pruning, quantization, and architecture search.

What would settle it

A measured drop in classification accuracy below the level required for reliable interference detection on the GNSS dataset after the full compression pipeline is applied.

Figures

Figures reproduced from arXiv: 2606.23210 by Axel Plinge, Felix Ott, Lucas Heublein, Mark Deutel.

Figure 2
Figure 2. Figure 2: Element-wise pruning (left) removes individual [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 1
Figure 1. Figure 1: Overview of our methodology. The left branch ap [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: Number of parameters for each network layer for [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Pareto front identified by PrototypeNAS, i.e., computational cost in FLOPs against four zero-shot proxies (MeCo, [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Comparison of model accuracy and their corresponding [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Comparison of compute FLOPs, accuracy for floating [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗
read the original abstract

Embedded global navigation satellite system (GNSS) interference monitoring requires fast and memory-efficient inference to process large volumes of raw in-phase and quadrature (IQ) samples in real time. At the same time, increasingly expressive deep neural networks (DNNs) are needed for robust interference classification and characterization across diverse signal conditions. This creates a fundamental tension between predictive performance and deployability on resource-constrained hardware. In this paper, we investigate efficient network inference for GNSS interference characterization using iterative structured pruning, post-training static quantization, and hardware-aware zero-shot neural architecture search (NAS). Starting from MCUNet as a compact baseline, we analyze how model compression and automated architecture optimization affect model size, computational complexity, and memory usage while maintaining task performance. Experiments on a GNSS interference dataset, covering both classification and generalized characterization, show the benefits of combining compression and hardware-aware design for embedded deployment. Our results provide practical guidance for developing compact machine learning (ML) models for real-time GNSS interference monitoring on embedded platforms (iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper claims that combining iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS (starting from MCUNet) yields deployable models for GNSS interference classification and generalized characterization tasks. Experiments on a GNSS interference dataset show benefits for embedded deployment on platforms including the iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5, while maintaining acceptable task performance and providing practical guidance for real-time monitoring.

Significance. If the results hold, this work offers practical guidance on balancing model expressiveness with deployability for embedded GNSS applications, where real-time processing of large IQ sample volumes is required. The multi-platform evaluation and focus on standard compression plus NAS techniques could inform similar efforts in other resource-constrained sensing domains.

major comments (1)
  1. [Abstract] Abstract: the claim that performance is maintained supplies no quantitative metrics, ablation results, or error analysis, so the central claim that task performance remains acceptable after pruning, quantization, and NAS cannot be verified from the available text.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed review and constructive comment. We address the major concern point-by-point below and commit to revisions where appropriate.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that performance is maintained supplies no quantitative metrics, ablation results, or error analysis, so the central claim that task performance remains acceptable after pruning, quantization, and NAS cannot be verified from the available text.

    Authors: We agree that the abstract should provide quantitative support for the claim of maintained performance. The full manuscript contains detailed results, including accuracy metrics before/after compression (e.g., classification accuracy, F1 scores), model size reductions (in KB), MACs, memory footprint, and latency measurements across the three platforms, along with ablation studies on pruning ratios, quantization bits, and NAS variants, plus error analysis via confusion matrices and per-class performance. To address the comment, we will revise the abstract to include key quantitative highlights such as 'reducing model size by X% with <Y% accuracy drop' while preserving the overall narrative. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper describes an experimental pipeline applying iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS (starting from MCUNet) to a GNSS interference dataset. No derivation chain, fitted parameters renamed as predictions, self-definitional equations, or load-bearing self-citations are present. Claims rest on empirical measurements of size, complexity, memory, and task performance, which are externally falsifiable and independent of any internal reduction to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no free parameters, axioms, or invented entities can be extracted.

pith-pipeline@v0.9.1-grok · 5736 in / 873 out tokens · 21808 ms · 2026-06-26T09:19:26.819943+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

34 extracted references · 8 canonical work pages · 4 internal anchors

  1. [1]

    Global Navigation Satellite Systems as Critical Infrastructure: A Cross-Sectoral Impact Assessment of Service Interrup- tions in Europe,

    I. Prezelj and J. Juvan, “Global Navigation Satellite Systems as Critical Infrastructure: A Cross-Sectoral Impact Assessment of Service Interrup- tions in Europe,” inProgress in Disaster Science, Jan. 2026

  2. [2]

    GenAI for Energy-Efficient and Interference-Aware Compressed Sens- ing of GNSS Signals on a Google Edge TPU,

    T. Wegner, L. Heublein, T. Feigl, F. Ott, C. Mutschler, and A. R ¨ugamer, “GenAI for Energy-Efficient and Interference-Aware Compressed Sens- ing of GNSS Signals on a Google Edge TPU,” inIEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025, pp. 1149–1160

  3. [3]

    GNSS Interference Classification Using Federated Reservoir Computing,

    Z. Ye, Y . Gao, X. Liu, Y . Xiao, M. Xiao, and S. Zammit, “GNSS Interference Classification Using Federated Reservoir Computing,” in IEEE Intl. Conf. on Communication Technology (ICCT), Chengdu, China, Oct. 2024

  4. [4]

    Federated Learning of Jamming Classifiers: From Global to Personalized Models,

    P. Wu, H. Calatrava, T. Imbiriba, and P. Closas, “Federated Learning of Jamming Classifiers: From Global to Personalized Models,” in NAVIGATION: Journal of the Institute of Navigation, Mar. 2025

  5. [5]

    Dictionary- Based Contrastive Learning for GNSS Jamming Detection,

    Z. Hussain, A. Majal, A. H. Chughtai, and T. Nadeem, “Dictionary- Based Contrastive Learning for GNSS Jamming Detection,” inarXiv preprint arXiv:2512.07512, Dec. 2025

  6. [6]

    V AE-based Feature Disentanglement for Data Augmentation and Compression in Generalized GNSS Interference Classification,

    L. Heublein, S. Kocher, T. Feigl, A. R ¨ugamer, C. Mutschler, and F. Ott, “V AE-based Feature Disentanglement for Data Augmentation and Compression in Generalized GNSS Interference Classification,” in IEEE Intl. Conf. on Localization and GNSS (ICL-GNSS), Rome, Italy, Jun. 2025

  7. [7]

    Variational and Generative Models with Quantization for Disentanglement and Compressed Sensing of GNSS Spectrograms,

    L. Heublein, T. Feigl, A. R ¨ugamer, C. Mutschler, and F. Ott, “Variational and Generative Models with Quantization for Disentanglement and Compressed Sensing of GNSS Spectrograms,” inIEEE Journal of Indoor and Seamless Positioning and Navigation (J-ISPIN), vol. 4, Jan. 2026, pp. 65–81

  8. [8]

    GAC-KAN: An Ultra- Lightweight GNSS Interference Classifier for GenAI-Powered Con- sumer Edge Devices,

    Z. Zeng, K. Wang, Z. Zhang, and Y . Xiu, “GAC-KAN: An Ultra- Lightweight GNSS Interference Classifier for GenAI-Powered Con- sumer Edge Devices,” inarXiv preprint arXiv:2602.11186, Jan. 2026

  9. [9]

    Towards a Faster GNSS Interference Classification: A GRU-Based Approach Using Spectrograms,

    I. E. Mehr, G. Caputo, D. Salza, M. Fantino, and F. Dovis, “Towards a Faster GNSS Interference Classification: A GRU-Based Approach Using Spectrograms,” inIEEE/ION Position, Location and Navigation Symposium (PLANS), Salt Lake City, UT, May 2025

  10. [10]

    A Foundation Model for Wireless Technology Recognition Using IQ Timeseries,

    M. Cheraghinia, E. D. Poorter, J. Fontaine, M. Debbah, and A. Shahid, “A Foundation Model for Wireless Technology Recognition Using IQ Timeseries,” inIEEE Open Journal of the Communications Society (OJCOMS), vol. 6, Nov. 2025, pp. 9879–9896

  11. [11]

    EMind: A Foundation Model for Multi-Task Electromagnetic Signals Understanding,

    L. Luo, W. Gui, Y . Liu, F. Wang, Z. Zhuang, Y . Zhang, Z. Guo, Q. Zhao, Z. Ma, H. He, M. Liu, Z. Cong, X. Liu, J. Li, X. Qiu, W. Xie, Y . Sun, and M. Sun, “EMind: A Foundation Model for Multi-Task Electromagnetic Signals Understanding,” inarXiv preprint arXiv:2508.18785, Aug. 2025

  12. [12]

    MCUNet: Tiny Deep Learning on IoT Devices,

    J. Lin, W.-M. Chen, Y . Lin, C. Gan, and S. Han, “MCUNet: Tiny Deep Learning on IoT Devices,” inAdvances in Neural Information Processing Systems (NIPS), vol. 33(982), Dec. 2020, pp. 11 711–11 722

  13. [13]

    MLPerf Tiny Benchmark,

    C. Banbury, V . J. Reddi, P. Torelli, J. Holleman, N. Jeffries, C. Kiraly, P. Montino, D. Kanter, S. Ahmed, D. Pau, U. Thakker, A. Torrini, P. Warden, J. Cordaro, G. D. Guglielmo, J. Duarte, S. Gibellini, V . Parekh, H. Tran, N. Tran, N. Wenxu, and X. Xuesong, “MLPerf Tiny Benchmark,” inarXiv preprint arXiv:2106.07597, Aug. 2021

  14. [14]

    TinyML-Enabled Frugal Smart Objects: Challenges and Opportunities,

    R. Sanchez-Iborra and A. F. Skarmeta, “TinyML-Enabled Frugal Smart Objects: Challenges and Opportunities,” inIEEE Circuits and Systems Magazine (MCAS), vol. 20(3), Aug. 2020, pp. 4–18

  15. [15]

    Federated Optimization: Distributed Machine Learning for On-Device Intelligence

    J. Kone ˘cn´y, H. B. McMahan, D. Ramage, and P. Richt ´arik, “Federated Optimization: Distributed Machine Learning for On-Device Intelli- gence,” inarXiv preprint arXiv:1610.02527, Oct. 2016

  16. [16]

    TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning,

    H. Cai, C. Gan, L. Zhu, and S. Han, “TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning,” inAdvanced in Neural Information Processing Systems (NIPS), 2020

  17. [17]

    Low-Energy On-Device Personalization for MCUs,

    Y . Huang, R. Aloufi, X. Cadet, Y . Zhao, P. Barnaghi, and H. Haddai, “Low-Energy On-Device Personalization for MCUs,” inIEEE/ACM Symposium on Edge Computing (SEC), Rome, Italy, Dec. 2024

  18. [18]

    Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

    S. Han, H. Mao, and W. J. Dally, “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding,” inarXiv preprint arXiv:1510.00149, Feb. 2016

  19. [19]

    Predict- ing Parameters in Deep Learning,

    M. Denil, B. Shakibi, L. Dinh, M. Ranzato, and N. D. Freitas, “Predict- ing Parameters in Deep Learning,” inAdvances in Neural Information Processing Systems (NIPS), vol. 26, Dec. 2013, pp. 2148–2156

  20. [20]

    Neural Architecture Search Without Training,

    J. Mellor, J. Turner, A. Storkey, and E. J. Crowley, “Neural Architecture Search Without Training,” inIntl. Conf. on Machine Learning (ICML), 2021, pp. 7588–7598

  21. [21]

    Once-for-All: Train One Network and Specialize It for Efficient Deployment,

    H. Cai, C. Gan, T. Wang, Z. Zhang, and S. Han, “Once-for-All: Train One Network and Specialize It for Efficient Deployment,” inIntl. Conf. on Learning Representations (ICLR), 2020

  22. [22]

    LiteJam: A Lightweight Deep Learning Architecture for Real-Time GNSS Inter- ference Detection and Characterization in UA Vs,

    Y . Chen, J. Wang, Z. Fang, T. Ni, J. Geng, and W. Ge, “LiteJam: A Lightweight Deep Learning Architecture for Real-Time GNSS Inter- ference Detection and Characterization in UA Vs,” inIEEE Internet of Things Journal (JIOT), vol. 13(7), Jan. 2026, pp. 13 472–13 485

  23. [23]

    Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference,

    B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko, “Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference,” inIEEE/CVF Intl. Conf. on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, Jun. 2018, pp. 2704–2713

  24. [24]

    Energy-Efficient Deployment of Deep Learning Applications on Cortex-M Based Mi- crocontrollers Using Deep Compression,

    M. Deutel, P. Woller, C. Mutschler, and J. Teich, “Energy-Efficient Deployment of Deep Learning Applications on Cortex-M Based Mi- crocontrollers Using Deep Compression,” inWorkshop on Methods and Description Languages for Modelling and Verification of Circuits and Systems (MBMV), Freiburg, Germany, Mar. 2023, pp. 1–12

  25. [25]

    Combining Multi- objective Bayesian Optimization with Reinforcement Learning for TinyML,

    M. Deutel, G. Kontes, C. Mutschler, and J. Teich, “Combining Multi- objective Bayesian Optimization with Reinforcement Learning for TinyML,” vol. 5, no. 3. ACM New York, NY , 2025, pp. 1–21

  26. [26]

    PrototypeNAS: Rapid Design of Deep Neural Networks for Microcontroller Units

    M. Deutel, S. Geis, and A. Plinge, “PrototypeNAS: Rapid Design of Deep Neural Networks for Microcontroller Units,” inarXiv preprint arXiv:2603.15106, Mar. 2026

  27. [27]

    Evolving Comprehensive Proxies for Zero-Shot Neural Architecture Search,

    J. Huang, B. Xue, Y . Sun, and M. Zhang, “Evolving Comprehensive Proxies for Zero-Shot Neural Architecture Search,” inGenetic and Evolutionary Computation Conf. (GECCO), Jul. 2025, pp. 1246–1254

  28. [28]

    MeCo: Zero-Shot NAS with One Data and Single Forward Pass via Minimum Eigenvalue of Correlation,

    T. Jiang, H. Wang, and R. Bie, “MeCo: Zero-Shot NAS with One Data and Single Forward Pass via Minimum Eigenvalue of Correlation,” in Advances in Neural Information Processing Systems (NIPS), Sep. 2023

  29. [29]

    SNIP: Single-Shot Network Pruning Based on Connection Sensitivity,

    N. Lee, T. Ajanthan, and P. Torr, “SNIP: Single-Shot Network Pruning Based on Connection Sensitivity,” inIntl. Conf. on Learning Represen- tations (ICLR), Dec. 2018

  30. [30]

    ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients,

    G. Li, Y . Yang, K. Bhardwaj, and R. Marculescu, “ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients,” inIntl. Conf. on Learning Representations (ICLR), 2023

  31. [31]

    Mo- bileNetV2: Inverted Residuals and Linear Bottlenecks,

    M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “Mo- bileNetV2: Inverted Residuals and Linear Bottlenecks,” inIEEE/CVF Intl. Conf. on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520

  32. [32]

    Concept-Level Debugging of Part-Prototype Networks,

    A. Bontempelli, S. Teso, K. Tentori, F. Giunchiglia, and A. Passerini, “Concept-Level Debugging of Part-Prototype Networks,” inIntl. Conf. on Learning Representations (ICLR), Feb. 2023

  33. [33]

    Deep Residual Learning for Image Recognition,

    K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” inIEEE/CVF Intl. Conf. on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV , Jun. 2016

  34. [34]

    SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

    F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, and K. Keutzer, “SqueezeNet: AlexNet-Level Accuracy With 50x Fewer Parameters and<0.5MB Model Size,” inarXiv preprint arXiv:1602.07360, Nov. 2016