Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI
Pith reviewed 2026-05-08 04:31 UTC · model grok-4.3
The pith
Integrating FP16 constraints into hardware-aware NAS recovers approximately two-thirds (0.046 of 0.07 mIoU) of the accuracy lost when deploying models on low-precision edge VPUs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By integrating deployment-aligned low-precision training directly into hardware-aware NAS, candidate architectures are exposed to FP16 numerical constraints during fine-tuning and evaluation. This enables joint optimization of architectural efficiency and numerical robustness without modifying the search space or evolutionary strategy. Evaluated on vessel segmentation for spaceborne maritime monitoring targeting the Intel Movidius Myriad X VPU, the approach achieves 0.826 mIoU on-device for an architecture of 95,791 parameters, compared with 0.78 mIoU after post-training precision conversion from a full-precision baseline of 0.85 mIoU, thereby recovering approximately two-thirds of the gap.
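The "approximately two-thirds" figure can be checked directly from the three mIoU values quoted above. The following is a minimal arithmetic sketch; the function name `recovered_fraction` is illustrative and does not come from the paper.

```python
def recovered_fraction(full_precision, post_conversion, aligned):
    """Fraction of the deployment-induced gap closed by the aligned model."""
    gap = full_precision - post_conversion
    if gap <= 0:
        raise ValueError("post-conversion score must be below the full-precision score")
    return (aligned - post_conversion) / gap


# mIoU values quoted in the core claim: 0.85 (full precision),
# 0.78 (post-training conversion), 0.826 (deployment-aligned training).
fraction = recovered_fraction(0.85, 0.78, 0.826)
print(f"{fraction:.3f}")  # prints 0.657
```

The gap is 0.07 mIoU and the recovered part is 0.046 mIoU, so the ratio is about 0.657, slightly under two-thirds.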
What carries the argument
Deployment-aligned low-precision training, the mechanism that exposes every NAS candidate to FP16 constraints during both fine-tuning and evaluation so that numerical robustness is optimized together with architectural efficiency.
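To make "exposed to FP16 constraints" concrete, here is a small sketch of what half-precision rounding does to a single dot product. It is not the authors' training code: the helper names `to_fp16` and `dot_fp16` and the toy values are hypothetical, and it assumes plain IEEE 754 round-to-nearest after every multiply and add, which may differ from what the Myriad X actually does.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def dot_fp16(weights, inputs):
    """Dot product where operands, products and the accumulator are rounded to FP16."""
    acc = 0.0
    for w, v in zip(weights, inputs):
        prod = to_fp16(to_fp16(w) * to_fp16(v))  # round each product
        acc = to_fp16(acc + prod)                # round each partial sum
    return acc


def dot_reference(weights, inputs):
    """Same dot product in Python's native double precision."""
    return sum(w * v for w, v in zip(weights, inputs))


# Hypothetical toy values, not taken from the paper.
weights = [0.1, -0.2, 0.3, 0.05]
inputs = [1.5, 2.25, -0.75, 8.0]

ref = dot_reference(weights, inputs)
low = dot_fp16(weights, inputs)
print(f"reference={ref:.6f} fp16={low:.6f} abs_error={abs(ref - low):.6f}")
```

A candidate whose accuracy is measured through something like `dot_fp16` rather than `dot_reference` is scored under the numerics it would meet at deployment, which is the idea the review attributes to the paper.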
If this is right
- On-device accuracy rises for an unchanged model size and architecture.
- The accuracy gap between full-precision NAS and low-precision deployment shrinks without extra parameters.
- No alterations to the evolutionary search strategy or search space are needed.
- The same compact network meets stricter latency and accuracy targets on resource-constrained VPUs.
Where Pith is reading between the lines
- The same alignment technique could be tested on other low-precision formats such as INT8 or on different edge accelerators.
- Space missions that cannot perform post-deployment retraining would benefit from addressing numerical robustness during the search rather than after it.
- Hardware-aware NAS frameworks may need to treat numerical constraints as first-class hardware metrics rather than post-processing steps.
Load-bearing premise
That exposing architectures to FP16 constraints during NAS fine-tuning and evaluation accurately predicts and improves real deployment behavior on the target VPU without introducing search biases.
What would settle it
Deploy the architecture selected by the low-precision NAS on the actual Myriad X VPU and measure its mIoU; if the result falls to or below the 0.78 mIoU obtained by post-training conversion of a full-precision NAS model, the central claim does not hold.
Original abstract
Designing deep networks that meet strict latency and accuracy constraints on edge accelerators increasingly relies on hardware-aware optimization, including neural architecture search (NAS) guided by device-level metrics. Yet most hardware-aware NAS pipelines still optimize architectures under full-precision assumptions and apply low-precision adaptation only after the search, leading to a mismatch between optimization-time behavior and deployment-time execution on low-precision hardware that can substantially degrade accuracy. We address this limitation by integrating deployment-aligned low-precision training directly into hardware-aware NAS. Candidate architectures are exposed to FP16 numerical constraints during fine-tuning and evaluation, enabling joint optimization of architectural efficiency and numerical robustness without modifying the search space or evolutionary strategy. We evaluate the proposed framework on vessel segmentation for spaceborne maritime monitoring, targeting the Intel Movidius Myriad X Vision Processing Unit (VPU). While post-training precision conversion reduces on-device performance from 0.85 to 0.78 mIoU, deployment-aligned low-precision training achieves 0.826 mIoU on-device for the same architecture (95,791 parameters), recovering approximately two-thirds of the deployment-induced accuracy gap without increasing model complexity. These results demonstrate that incorporating deployment-consistent numerical constraints into hardware-aware NAS substantially improves robustness and alignment between optimization and deployment for resource-constrained edge Artificial Intelligence (AI).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes integrating deployment-aligned low-precision (FP16) training directly into hardware-aware neural architecture search (NAS) for edge AI. Candidate architectures are exposed to FP16 numerical constraints during fine-tuning and evaluation phases of the NAS process, enabling joint optimization of efficiency and robustness without changes to the search space or evolutionary strategy. Evaluated on vessel segmentation for spaceborne maritime monitoring targeting the Intel Movidius Myriad X VPU, post-training precision conversion drops performance from 0.85 to 0.78 mIoU, while the proposed method achieves 0.826 mIoU on-device for the same 95,791-parameter architecture, recovering approximately two-thirds of the deployment-induced accuracy gap.
Significance. If the results hold under rigorous validation, this approach could meaningfully advance hardware-aware NAS for low-precision edge devices by reducing the mismatch between optimization and deployment. The empirical on-device measurements on a real VPU for a space application provide practical value, and the lack of added model complexity strengthens the case for adoption in resource-constrained settings.
major comments (2)
- Abstract: The headline claim that deployment-aligned low-precision training recovers ~2/3 of the 0.85-to-0.78 mIoU gap (achieving 0.826 mIoU on-device for the 95,791-param model) rests on the assumption that FP16 simulation during NAS fine-tuning and evaluation accurately predicts real Myriad X VPU behavior; no verification of VPU-specific effects such as rounding modes, saturation, or fused multiply-add precision is provided, risking that the reported alignment is an artifact of the simulation rather than true deployment robustness.
- Evaluation section: No error bars, standard deviations, or statistics from multiple runs are reported for the mIoU figures, and there is no ablation isolating the contribution of the FP16 exposure within NAS from the base evolutionary search strategy or other factors; this undermines confidence in the cross-condition comparison and the claim that no modifications to search space or strategy were needed.
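The request for error bars in the second major comment amounts to reporting a mean and spread over independent runs. A minimal sketch of that summary follows; the run values are placeholders invented purely for illustration, since the paper reports a single run, and `summarize_runs` is a hypothetical helper name.

```python
import statistics


def summarize_runs(miou_per_run):
    """Return (mean, sample standard deviation) of mIoU over independent runs."""
    if len(miou_per_run) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(miou_per_run), statistics.stdev(miou_per_run)


# PLACEHOLDER values for illustration only; these are NOT results from the paper.
placeholder_runs = [0.80, 0.82, 0.84]
mean, std = summarize_runs(placeholder_runs)
print(f"mIoU = {mean:.3f} +/- {std:.3f} over {len(placeholder_runs)} runs")
```

With real repeated runs, a difference such as 0.826 versus 0.78 could then be judged against the run-to-run spread rather than taken at face value.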
minor comments (3)
- Provide more explicit details on the implementation of FP16 constraints (e.g., quantization simulation method, scaling, or clipping) and how they were selected to approximate the target VPU.
- Include additional baselines such as standard post-training quantization applied after full-precision NAS or other low-precision NAS variants for better context on the improvement magnitude.
- Clarify reproducibility aspects including exact search space definition, evolutionary parameters, and training hyperparameters even if the strategy itself is unchanged.
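The first minor comment asks about scaling in the FP16 setup. One reason scaling matters in half precision generally is underflow: very small values round to zero unless they are scaled up first. The sketch below illustrates this with an assumed power-of-two scale; whether the authors used loss scaling, and with what factor, is not stated in the material reviewed here.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


SCALE = 1024.0    # assumed scale factor, chosen only for illustration
tiny_grad = 1e-8  # below half of FP16's smallest subnormal (2**-24, about 5.96e-8)

unscaled = to_fp16(tiny_grad)        # underflows to 0.0
scaled = to_fp16(tiny_grad * SCALE)  # large enough to be represented in FP16
recovered = scaled / SCALE           # unscale in higher precision

print(unscaled, recovered)
```

Here the unscaled value is lost entirely, while the scaled path recovers it to within about one percent.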
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate planned revisions to strengthen the paper.
- Referee: Abstract: The headline claim that deployment-aligned low-precision training recovers ~2/3 of the 0.85-to-0.78 mIoU gap (achieving 0.826 mIoU on-device for the 95,791-param model) rests on the assumption that FP16 simulation during NAS fine-tuning and evaluation accurately predicts real Myriad X VPU behavior; no verification of VPU-specific effects such as rounding modes, saturation, or fused multiply-add precision is provided, risking that the reported alignment is an artifact of the simulation rather than true deployment robustness.
  Authors: We thank the referee for this observation. The reported 0.826 mIoU is measured via direct on-device inference on the Intel Movidius Myriad X VPU after deployment, not in simulation. The FP16 simulation is applied only during NAS fine-tuning and evaluation to select architectures that are numerically robust under low precision. We used standard framework-level FP16 emulation without custom modeling of VPU-specific rounding, saturation, or FMA behaviors. In the revised manuscript we will add a paragraph in the evaluation section clarifying these simulation assumptions and noting that the on-device results provide independent empirical confirmation of the robustness gains.
  revision: partial
- Referee: Evaluation section: No error bars, standard deviations, or statistics from multiple runs are reported for the mIoU figures, and there is no ablation isolating the contribution of the FP16 exposure within NAS from the base evolutionary search strategy or other factors; this undermines confidence in the cross-condition comparison and the claim that no modifications to search space or strategy were needed.
  Authors: We agree that error bars from multiple runs and a dedicated ablation would increase confidence. Performing repeated full NAS runs is computationally prohibitive on the target hardware, so results are from a single evolutionary search (standard practice in many NAS works). The comparison isolates the effect of adding FP16 exposure during NAS versus post-training quantization on the identical architecture found by the unchanged search. In revision we will insert a limitations subsection explaining the single-run reporting, the rationale for leaving the evolutionary strategy unmodified, and any retrievable run-to-run variance from intermediate logs.
  revision: partial
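The first exchange above turns on whether framework-level FP16 emulation matches device behavior such as saturation. The sketch below shows how two overflow policies can diverge on the same value. Both policies are illustrative assumptions: which one the Myriad X follows is not stated in the paper or the rebuttal, and the `saturate` flag and the example magnitude are hypothetical.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x, saturate=False):
    """Round x to FP16; on overflow return +/-inf, or +/-FP16_MAX if saturate is True."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values too large for half precision,
        # so the overflow policy is applied here explicitly.
        limit = FP16_MAX if saturate else math.inf
        return math.copysign(limit, x)


activation = 300.0 * 300.0  # 90000.0, a hypothetical intermediate magnitude

ieee_result = to_fp16(activation)                      # overflow to infinity
saturated_result = to_fp16(activation, saturate=True)  # clamp to FP16_MAX
print(ieee_result, saturated_result)  # prints: inf 65504.0
```

If emulation and device disagree on cases like this, a network that looks robust in emulation could still behave differently on hardware, which is why the on-device measurement carries the weight in the authors' reply.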
Circularity Check
No circularity: empirical NAS integration with on-device validation
The paper reports an empirical framework that inserts FP16 constraints into existing hardware-aware NAS (fine-tuning and evaluation phases) and measures resulting on-device mIoU on the Myriad X VPU. No equations, derivations, or fitted parameters are presented that reduce to their own inputs by construction. The headline recovery (0.826 mIoU) is an observed experimental outcome, not a prediction forced by a model fit or self-citation chain. The method re-uses a standard evolutionary NAS strategy without claimed uniqueness theorems or ansatzes imported from prior author work. Self-contained against external benchmarks.