EPNAS: Efficient Progressive Neural Architecture Search
Pith reviewed 2026-05-25 01:13 UTC · model grok-4.3
The pith
EPNAS uses a progressive search policy with REINFORCE performance prediction to find high-accuracy networks faster than prior NAS methods on image tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EPNAS efficiently handles large search spaces through a novel progressive search policy with performance prediction based on REINFORCE. It searches target networks in parallel, which is more scalable on parallel systems such as GPU/TPU clusters. More importantly, EPNAS can be generalized to architecture search with multiple resource constraints, e.g., model size, compute complexity or intensity. On both CIFAR10 and ImageNet, EPNAS is superior with respect to architecture searching speed and recognition accuracy.
What carries the argument
Progressive search policy with REINFORCE-based performance prediction that ranks architectures without full training of each candidate.
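As a rough illustration of the REINFORCE ingredient only, the sketch below applies the score-function update to a categorical policy over three candidate operations. It is not the EPNAS controller: the toy reward table, the moving-average baseline, the learning rate, and all function names are assumptions made for this example, and neither the progressive search nor the performance predictor is modeled.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, rng, lr=0.1, baseline=0.0):
    """One REINFORCE update for a categorical policy over candidate operations."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    reward = reward_fn(action)
    advantage = reward - baseline
    # d/d logit_i of log pi(action) equals 1[i == action] - probs[i].
    new_logits = [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - probs[i])
        for i, logit in enumerate(logits)
    ]
    return new_logits, action, reward


# Hypothetical validation accuracies for three candidate operations.
TOY_ACCURACY = [0.2, 0.5, 0.9]


def train_policy(steps=2000, seed=0):
    """Run repeated REINFORCE steps against the toy reward table."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]
    baseline = 0.0
    for _ in range(steps):
        logits, _, reward = reinforce_step(
            logits, lambda a: TOY_ACCURACY[a], rng, lr=0.1, baseline=baseline
        )
        # Moving-average baseline to reduce the variance of the update.
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


final_probs = train_policy()
print([round(p, 3) for p in final_probs])
```

In a real search the reward would come from training or predicting the accuracy of a sampled architecture, which is exactly the step whose reliability the load-bearing premise depends on.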
If this is right
- EPNAS applies directly to searches under simultaneous constraints such as model size and compute intensity.
- Parallel network evaluation scales the method to GPU and TPU clusters without serial bottlenecks.
- The same policy yields architectures that exceed MobileNetV2 accuracy on both CIFAR10 and ImageNet.
- Resource-aware search becomes feasible for mobile and cloud platforms without separate runs per constraint.
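One common way to fold several resource budgets into a single search reward is a soft penalty on budget overshoot. The sketch below illustrates that idea; the penalty form, the field names, and the numbers are hypothetical and are not taken from the paper, which may combine constraints differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    accuracy: float          # validation accuracy in [0, 1]
    params_millions: float   # model size, in millions of parameters
    gflops: float            # compute per forward pass


def constrained_reward(c, max_params_millions, max_gflops, penalty=1.0):
    """Accuracy minus a penalty proportional to relative budget overshoot.

    Assumed scalarization for illustration; candidates within budget are
    scored by accuracy alone.
    """
    size_over = max(0.0, c.params_millions / max_params_millions - 1.0)
    flops_over = max(0.0, c.gflops / max_gflops - 1.0)
    return c.accuracy - penalty * (size_over + flops_over)


def pick_best(candidates, max_params_millions, max_gflops):
    """Return the candidate with the highest constrained reward."""
    return max(
        candidates,
        key=lambda c: constrained_reward(c, max_params_millions, max_gflops),
    )


# Made-up candidates: the larger one is more accurate but over both budgets.
small = Candidate(accuracy=0.92, params_millions=3.0, gflops=0.3)
large = Candidate(accuracy=0.95, params_millions=8.0, gflops=0.6)
best = pick_best([small, large], max_params_millions=4.0, max_gflops=0.5)
print(best)
```

Because both budgets enter one reward, a single search run can trade accuracy against size and compute at once, which is the property the bullets above rely on.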
Where Pith is reading between the lines
- If the ranking prediction generalizes, EPNAS could shorten development cycles for custom models on new datasets.
- The parallel design suggests straightforward extension to distributed training setups beyond single clusters.
- Constraint handling may allow direct optimization for latency targets on specific hardware without post-search pruning.
Load-bearing premise
The REINFORCE-based performance prediction accurately ranks candidate architectures in large search spaces without requiring full training of each candidate.
What would settle it
A head-to-head run on ImageNet where EPNAS produces lower top-1 accuracy or longer total search time than ENAS or PNAS under identical constraints would falsify the superiority claim.
Original abstract
In this paper, we propose Efficient Progressive Neural Architecture Search (EPNAS), a neural architecture search (NAS) that efficiently handles large search space through a novel progressive search policy with performance prediction based on REINFORCE [37]. EPNAS is designed to search target networks in parallel, which is more scalable on parallel systems such as GPU/TPU clusters. More importantly, EPNAS can be generalized to architecture search with multiple resource constraints, e.g., model size, compute complexity or intensity, which is crucial for deployment in widespread platforms such as mobile and cloud. We compare EPNAS against other state-of-the-art (SoTA) network architectures (e.g., MobileNetV2 [39]) and efficient NAS algorithms (e.g., ENAS [34], and PNAS [27]) on image recognition tasks using CIFAR10 and ImageNet. On both datasets, EPNAS is superior with respect to architecture searching speed and recognition accuracy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes EPNAS, a neural architecture search algorithm that employs a progressive search policy combined with a REINFORCE-based performance predictor to efficiently explore large search spaces. It emphasizes parallel search on GPU/TPU clusters and generalization to multiple resource constraints (model size, compute). Experiments on CIFAR-10 and ImageNet are claimed to show superiority over MobileNetV2, ENAS, and PNAS in both search speed and final recognition accuracy.
Significance. If the REINFORCE predictor's rankings prove reliable and the efficiency/accuracy claims are substantiated with proper controls, the work would offer a practical advance in scalable, constraint-aware NAS suitable for deployment on varied hardware platforms.
major comments (1)
- [Abstract] Abstract: the central claims of superiority in search speed and accuracy rest on the unvalidated assumption that the REINFORCE performance predictor produces rankings that correlate with true post-training accuracies. No rank-correlation statistics, held-out validation of the predictor, or ablation against random ranking are referenced, making it impossible to assess whether the reported gains are load-bearing or artifacts of the search procedure.
Simulated Author's Rebuttal
We thank the referee for the careful review and the specific comment on validation of the performance predictor. We address this point below.
-
Referee: [Abstract] Abstract: the central claims of superiority in search speed and accuracy rest on the unvalidated assumption that the REINFORCE performance predictor produces rankings that correlate with true post-training accuracies. No rank-correlation statistics, held-out validation of the predictor, or ablation against random ranking are referenced, making it impossible to assess whether the reported gains are load-bearing or artifacts of the search procedure.
Authors: We agree that the manuscript would be strengthened by explicit validation of the REINFORCE predictor. In the revised version we will add (1) Spearman's rank correlation between predictor scores and final accuracies on a held-out set of 200 architectures, (2) a description of how the predictor was trained and validated during search, and (3) an ablation replacing the learned predictor with random ranking while keeping all other components fixed. These additions will allow readers to judge whether the reported speed and accuracy gains depend on the quality of the rankings.
Revision: yes
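The rank-correlation check and the random-ranking baseline promised in the rebuttal could look roughly like the following. This is a minimal sketch with made-up scores for five architectures rather than the 200 mentioned above; the function names are illustrative, and ties are handled with average ranks.

```python
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(a), average_ranks(b))


def random_ranking_baseline(final_accuracies, trials=200, seed=0):
    """Mean Spearman's rho when predictor scores are replaced by a shuffle."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shuffled = list(final_accuracies)
        rng.shuffle(shuffled)
        total += spearman_rho(shuffled, final_accuracies)
    return total / trials


# Hypothetical predictor scores and post-training accuracies.
predicted = [0.71, 0.65, 0.80, 0.58, 0.77]
final_acc = [0.93, 0.91, 0.95, 0.90, 0.94]
print(spearman_rho(predicted, final_acc))
print(random_ranking_baseline(final_acc))
```

A learned predictor is only doing useful work if its correlation sits clearly above what the shuffled baseline produces on the same held-out architectures.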
Circularity Check
No circularity: method uses external RL baseline without self-referential reduction
The abstract and description present EPNAS as employing a REINFORCE-based predictor within a progressive search policy, with claims of superiority on CIFAR-10 and ImageNet. No equations, fitting procedures, or derivation steps are supplied that would allow a reduction (e.g., a performance prediction shown to be identical to its training targets by construction, or a uniqueness result imported solely via self-citation). The REINFORCE reference is to an external 1992 paper. Absent any load-bearing self-citation chain or ansatz smuggled through prior author work, the derivation chain cannot be shown to collapse to its inputs. This is the expected outcome for an empirical NAS description lacking internal mathematical closure.
Reference graph
Works this paper leans on
Yanqi Zhou et al.: EPNAS
- [1] Launching the speech commands dataset. https://ai.googleblog.com/2017/08/launching-speech-commands-dataset.html
- [2] D. Amodei et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. arXiv:1512.02595, December 2015.
- [3] Peter J. Angeline et al. An evolutionary algorithm that constructs recurrent neural networks. IEEE Transactions on Neural Networks, 5(1):54–65, 1994.
- [4] Bowen Baker et al. Designing neural network architectures using reinforcement learning. arXiv preprint arXiv:1611.02167, 2016.
- [5] Bowen Baker et al. Accelerating neural architecture search using performance prediction. arXiv preprint arXiv:1705.10823, 2017.
- [6] Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, and Quoc Le. Understanding and simplifying one-shot architecture search. In International Conference on Machine Learning, pages 549–558, 2018.
- [7] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. J. Mach. Learn. Res., 13:281–305, February 2012. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=2188385.2188395
- [8] Steve Brooks, Andrew Gelman, Galin Jones, and Xiao-Li Meng. Handbook of Markov chain Monte Carlo. CRC Press, 2011.
- [9] Han Cai et al. Efficient architecture search by network transformation. AAAI, 2018.
- [10] F. Chollet. Xception: Deep Learning with Depthwise Separable Convolutions. arXiv:1610.02357, October 2016.
- [11] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
- [12] Jin-Dong Dong, An-Chieh Cheng, Da-Cheng Juan, Wei Wei, and Min Sun. DPP-Net: Device-aware progressive search for Pareto-optimal neural architectures. ECCV, 2018.
- [13] Thomas Elsken et al. Neural architecture search: A survey. arXiv preprint arXiv:1808.05377, 2018.
- [14] Ariel Gordon, Elad Eban, Ofir Nachum, Bo Chen, Hao Wu, Tien-Ju Yang, and Edward Choi. MorphNet: Fast & simple resource-constrained structure learning of deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1586–1595, 2018.
- [15] Song Han et al. Learning both weights and connections for efficient neural networks. NIPS, pages 1135–1143, 2015.
- [16] Yihui He, Xiangyu Zhang, and Jian Sun. Channel pruning for accelerating very deep neural networks. In The IEEE International Conference on Computer Vision (ICCV), Oct 2017.
- [17] Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. AMC: AutoML for model compression and acceleration on mobile devices. In Proceedings of the European Conference on Computer Vision (ECCV), pages 784–800, 2018.
- [18] Andrew G. Howard et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. CoRR, abs/1704.04861, 2017. URL http://arxiv.org/abs/1704.04861
- [19] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, volume 1, page 3, 2017.
- [20] Itay Hubara et al. Quantized neural networks: Training neural networks with low precision weights and activations. arXiv:1609.07061, 2016.
- [21] Kirthevasan Kandasamy et al. Neural architecture search with Bayesian optimisation and optimal transport. CoRR, abs/1802.07191, 2018. URL http://arxiv.org/abs/1802.07191
- [22] T. Karras et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv:1710.10196, October 2017.
- [23] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
- [24] Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. The CIFAR-10 dataset. Online: http://www.cs.toronto.edu/kriz/cifar.html, 55, 2014.
- [25] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer, 2014.
- [26] Baoyuan Liu et al. Sparse convolutional neural networks. In CVPR, pages 806–814, June 2015.
- [27] Chenxi Liu et al. Progressive neural architecture search. In ECCV, pages 19–34, 2018.
- [28] Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018.
- [29] Hanxiao Liu et al. Hierarchical representations for efficient architecture search. ICLR, 2018.
- [30] Ilya Loshchilov and Frank Hutter. SGDR: Stochastic gradient descent with warm restarts. CoRR, abs/1608.03983, 2016. URL http://arxiv.org/abs/1608.03983
- [31] Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. ShuffleNet V2: Practical guidelines for efficient CNN architecture design. arXiv preprint arXiv:1807.11164, 2018.
- [32] Renato Negrinho and Geoff Gordon. DeepArchitect: Automatically Designing and Training Deep Architectures. 2017. URL http://arxiv.org/abs/1704.08792
- [34] Hieu Pham et al. Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268, 2018.
- [35] Esteban Real et al. Large-scale evolution of image classifiers. arXiv preprint arXiv:1703.01041, 2017.
- [36] Esteban Real et al. Regularized evolution for image classifier architecture search. CoRR, abs/1802.01548, 2018.
- [37] R. J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. 1992.
- [38] Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, and Quoc V. Le. Don’t decay the learning rate, increase the batch size. ICLR, 2018.
- [39] M. Sandler et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks. arXiv:1801.04381, January 2018.
- [40] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556, September 2014.
- [41] V. Sindhwani et al. Structured Transforms for Small-Footprint Deep Learning. arXiv:1510.01722, October 2015.
- [42] Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2):99–127, 2002.
- [43] Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, and Quoc V. Le. MnasNet: Platform-aware neural architecture search for mobile. arXiv preprint arXiv:1807.11626, 2018.
- [44] Thomas Elsken, Jan Hendrik Metzen, and Frank Hutter. Multi-objective architecture search for CNNs. CoRR, 2018. URL https://arxiv.org/abs/1804.09081
- [45] A. van den Oord et al. Parallel WaveNet: Fast High-Fidelity Speech Synthesis. arXiv:1711.10433, November 2017.
- [46] Y. Wu, Schuster, et al. Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144, 2016.
- [47] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. Aggregated residual transformations for deep neural networks. In Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on, pages 5987–5995. IEEE, 2017.
- [48] Saining Xie, Alexander Kirillov, Ross Girshick, and Kaiming He. Exploring randomly wired neural networks for image recognition. arXiv preprint arXiv:1904.01569, 2019.
- [49] Sirui Xie, Hehui Zheng, Chunxiao Liu, and Liang Lin. SNAS: Stochastic neural architecture search. ICLR, 2019.
- [50] K. Xu et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. arXiv:1502.03044, February 2015.
- [51] Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. CoRR, abs/1707.01083, 2017.
- [52] Barret Zoph and Quoc V. Le. Neural Architecture Search with Reinforcement Learning. 2016. ISSN 1938-7228. doi: 10.1016/j.knosys.2015.01.010. URL http://arxiv.org/abs/1611.01578
- [53] Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V. Le. Learning transferable architectures for scalable image recognition. CVPR, 2018.