Efficient Network Inference via Hardware-Aware Architecture Search, Model Pruning & Quantization
Pith reviewed 2026-06-26 09:19 UTC · model grok-4.3
The pith
Combining structured pruning, post-training quantization, and hardware-aware NAS produces compact models suitable for real-time GNSS interference monitoring on embedded hardware.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Starting from MCUNet as baseline, iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS jointly reduce model size, computational complexity, and memory usage while preserving performance on GNSS interference classification and generalized characterization tasks, as measured on a dedicated dataset and verified across iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5.
What carries the argument
Iterative structured pruning combined with post-training static quantization and hardware-aware zero-shot NAS applied to the MCUNet baseline.
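To make the pruning step concrete, here is a minimal sketch of iterative structured (filter-level) pruning. It assumes an L1-norm ranking criterion, which is a common choice but is not confirmed by the abstract as the paper's criterion. The helper names (`filter_l1_norms`, `prune_filters`, `iterative_prune`) and the toy weights are hypothetical.

```python
# Illustrative sketch only: L1-norm filter ranking is an ASSUMED criterion,
# not necessarily the one used in the paper. All names here are hypothetical.

def filter_l1_norms(weights):
    """Return the L1 norm of each filter (each filter is a flat list of floats)."""
    return [sum(abs(w) for w in f) for f in weights]


def prune_filters(weights, ratio):
    """Drop the fraction `ratio` of filters with the smallest L1 norm.

    At least one filter is always kept. Returns (kept_filters, kept_indices).
    In a real network, the matching input channels of the next layer must be
    removed as well; that bookkeeping is omitted here.
    """
    n = len(weights)
    n_remove = min(int(n * ratio), n - 1)
    norms = filter_l1_norms(weights)
    order = sorted(range(n), key=lambda i: norms[i])  # weakest filters first
    removed = set(order[:n_remove])
    kept = [i for i in range(n) if i not in removed]
    return [weights[i] for i in kept], kept


def iterative_prune(weights, ratio_per_step, steps, fine_tune=None):
    """Alternate pruning with an optional fine-tuning callback."""
    for _ in range(steps):
        weights, _ = prune_filters(weights, ratio_per_step)
        if fine_tune is not None:
            weights = fine_tune(weights)  # placeholder for a retraining pass
    return weights


# Toy layer with four filters of two weights each (made-up values).
layer = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.0]]
pruned, kept = prune_filters(layer, 0.5)
final = iterative_prune(layer, 0.5, 2)
```

Removing whole filters, rather than individual weights, shrinks the dense tensor shapes themselves, which is why structured pruning tends to translate into real memory and compute savings on microcontrollers without sparse-kernel support.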
If this is right
- The compressed models satisfy the memory and latency constraints of the iMXRT1062 MCU and the two Raspberry Pi platforms for real-time IQ-sample processing.
- Both classification accuracy and generalized characterization quality stay within acceptable bounds after compression.
- The combination of the three techniques supplies a repeatable recipe for producing deployable GNSS-monitoring networks.
- Hardware-aware NAS guides architecture choices that further improve fit to the target embedded devices.
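The hardware-aware selection in the last bullet can be sketched as a constrained ranking: discard candidates that exceed the device budget, then take the one with the best zero-shot proxy score. This is a hedged illustration under stated assumptions. The `Candidate` and `Budget` types, the proxy scores, and every number below are hypothetical placeholders, not device specifications or values from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    flash_kb: float      # estimated weight storage
    peak_sram_kb: float  # estimated peak activation memory
    proxy_score: float   # zero-shot proxy, higher is better (assumed convention)


@dataclass(frozen=True)
class Budget:
    flash_kb: float
    sram_kb: float


def feasible(c: Candidate, b: Budget) -> bool:
    """A candidate fits only if it meets both memory limits."""
    return c.flash_kb <= b.flash_kb and c.peak_sram_kb <= b.sram_kb


def select(candidates: List[Candidate], budget: Budget) -> Optional[Candidate]:
    """Return the feasible candidate with the highest proxy score, or None."""
    ok = [c for c in candidates if feasible(c, budget)]
    return max(ok, key=lambda c: c.proxy_score) if ok else None


# Hypothetical search results and a hypothetical device budget.
pool = [
    Candidate("arch_a", flash_kb=900.0, peak_sram_kb=300.0, proxy_score=0.81),
    Candidate("arch_b", flash_kb=400.0, peak_sram_kb=180.0, proxy_score=0.77),
    Candidate("arch_c", flash_kb=350.0, peak_sram_kb=520.0, proxy_score=0.90),
]
best = select(pool, Budget(flash_kb=512.0, sram_kb=256.0))
```

Filtering before ranking keeps the hardware limits as hard constraints, so a high proxy score can never compensate for a model that does not fit on the device.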
Where Pith is reading between the lines
- The same compression sequence could be tested on other real-time signal-classification problems that run on similar microcontrollers.
- If hardware-specific latency measurements were collected during NAS, the search could be tightened to meet explicit timing budgets.
- Repeating the pipeline on datasets with more diverse interference types would test whether the observed size-performance trade-off generalizes.
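One way to compare size-performance trade-offs across datasets is to extract the Pareto front of (model size, accuracy) pairs for each one and check whether the fronts have a similar shape. The sketch below is illustrative only. `pareto_front` is a hypothetical helper and the data points are invented for the example.

```python
def pareto_front(points):
    """Return the non-dominated (size_kb, accuracy) points, sorted by size.

    A point is dominated if another point is no larger and no less accurate,
    and strictly better in at least one of the two.
    """
    front = []
    for i, (size, acc) in enumerate(points):
        dominated = any(
            (s2 <= size and a2 >= acc) and (s2 < size or a2 > acc)
            for j, (s2, a2) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((size, acc))
    return sorted(front)


# Invented (size in KB, accuracy) pairs, not results from the paper.
models = [(100, 0.90), (50, 0.88), (60, 0.85), (200, 0.91), (120, 0.89)]
front = pareto_front(models)
```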
Load-bearing premise
Predictive performance on classification and generalized characterization tasks remains acceptable after pruning, quantization, and architecture search.
What would settle it
A measured drop in classification accuracy below the level required for reliable interference detection on the GNSS dataset after the full compression pipeline is applied.
Original abstract
Embedded global navigation satellite system (GNSS) interference monitoring requires fast and memory-efficient inference to process large volumes of raw in-phase and quadrature (IQ) samples in real time. At the same time, increasingly expressive deep neural networks (DNNs) are needed for robust interference classification and characterization across diverse signal conditions. This creates a fundamental tension between predictive performance and deployability on resource-constrained hardware. In this paper, we investigate efficient network inference for GNSS interference characterization using iterative structured pruning, post-training static quantization, and hardware-aware zero-shot neural architecture search (NAS). Starting from MCUNet as a compact baseline, we analyze how model compression and automated architecture optimization affect model size, computational complexity, and memory usage while maintaining task performance. Experiments on a GNSS interference dataset, covering both classification and generalized characterization, show the benefits of combining compression and hardware-aware design for embedded deployment. Our results provide practical guidance for developing compact machine learning (ML) models for real-time GNSS interference monitoring on embedded platforms (iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5).
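As a rough illustration of what post-training static quantization involves, the sketch below calibrates a value range on representative samples, derives an affine int8 mapping, and applies it. It assumes simple min/max calibration and per-tensor asymmetric quantization. The abstract does not state which scheme the paper uses, and all function names and sample values are hypothetical.

```python
# Illustrative sketch only: min/max calibration with a per-tensor affine int8
# mapping is an ASSUMED scheme. Function names and data are hypothetical.

def calibrate(samples):
    """Find the observed value range over representative calibration data."""
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    # Extend the range to include 0 so that zero is exactly representable.
    return min(lo, 0.0), max(hi, 0.0)


def affine_params(lo, hi, qmin=-128, qmax=127):
    """Derive the scale and zero point mapping [lo, hi] onto [qmin, qmax]."""
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = int(round(qmin - lo / scale))
    return scale, max(qmin, min(qmax, zero_point))


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes, clipping anything outside the calibrated range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


# Made-up calibration batches standing in for layer activations.
calibration = [[-0.5, 0.2], [0.1, 1.5]]
lo, hi = calibrate(calibration)
scale, zp = affine_params(lo, hi)
x = [-0.5, 0.0, 0.7, 1.5, 3.0]  # 3.0 lies outside the calibrated range
q = quantize(x, scale, zp)
x_hat = dequantize(q, scale, zp)
```

"Static" here means the scale and zero point are fixed offline from calibration data, so inference on the device needs no floating-point range estimation at run time.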
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that combining iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS (starting from MCUNet) yields deployable models for GNSS interference classification and generalized characterization tasks. Experiments on a GNSS interference dataset show benefits for embedded deployment on platforms including the iMXRT1062 MCU, Raspberry Pi Zero 2W, and Raspberry Pi 5, while maintaining acceptable task performance and providing practical guidance for real-time monitoring.
Significance. If the results hold, this work offers practical guidance on balancing model expressiveness with deployability for embedded GNSS applications, where real-time processing of large IQ sample volumes is required. The multi-platform evaluation and focus on standard compression plus NAS techniques could inform similar efforts in other resource-constrained sensing domains.
major comments (1)
- [Abstract] The claim that performance is maintained is not backed by quantitative metrics, ablation results, or error analysis, so the central claim that task performance remains acceptable after pruning, quantization, and NAS cannot be verified from the available text.
Simulated Author's Rebuttal
We thank the referee for the detailed review and constructive comment. We address the major concern point-by-point below and commit to revisions where appropriate.
Referee: [Abstract] The claim that performance is maintained is not backed by quantitative metrics, ablation results, or error analysis, so the central claim that task performance remains acceptable after pruning, quantization, and NAS cannot be verified from the available text.
Authors: We agree that the abstract should provide quantitative support for the claim of maintained performance. The full manuscript contains detailed results: accuracy metrics before and after compression (e.g., classification accuracy, F1 scores), model size reductions (in KB), MACs, memory footprint, and latency measurements across the three platforms. It also includes ablation studies on pruning ratios, quantization bit widths, and NAS variants, plus error analysis via confusion matrices and per-class performance. To address the comment, we will revise the abstract to include key quantitative highlights such as 'reducing model size by X% with <Y% accuracy drop' while preserving the overall narrative.
Revision: yes
Circularity Check
No significant circularity detected
The paper describes an experimental pipeline applying iterative structured pruning, post-training static quantization, and hardware-aware zero-shot NAS (starting from MCUNet) to a GNSS interference dataset. No derivation chain, fitted parameters renamed as predictions, self-definitional equations, or load-bearing self-citations are present. Claims rest on empirical measurements of size, complexity, memory, and task performance, which are externally falsifiable and independent of any internal reduction to inputs by construction.