Recognition: no theorem link
From Regression to Inference: Meta-Learning Predictors for Neural Architecture Search
Pith reviewed 2026-05-12 03:54 UTC · model grok-4.3
The pith
Reframing NAS performance prediction as meta-learned conditional inference yields better generalization from few samples than standard regression.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By training a Convolutional Neural Process on context-target splits drawn from groups of synthesized tasks, the model learns to infer architecture performance conditionally rather than fitting a fixed regression mapping; this produces predictors whose ranking behavior on unseen architectures is measurably stronger under the same sample budgets used in prior work.
What carries the argument
Convolutional Neural Process meta-trained via context-target splits on synthesized tasks, which performs conditional function inference from partial performance observations.
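The carrying mechanism can be sketched as a meta-training loop: each synthesized task supplies (architecture encoding, performance) pairs, which are randomly split into a context set the model conditions on and a target set it must predict. A minimal sketch, assuming a hypothetical model object with a `step(context_x, context_y, target_x, target_y)` method; the task sampler and encodings are placeholders, not the paper's actual construction.

```python
import numpy as np

def context_target_split(x, y, rng, min_context=3):
    """Randomly partition one task's (encoding, performance) pairs
    into a context set and a target set."""
    n = len(x)
    idx = rng.permutation(n)
    n_ctx = int(rng.integers(min_context, n - 1))  # at least one target point
    ctx, tgt = idx[:n_ctx], idx[n_ctx:]
    return x[ctx], y[ctx], x[tgt], y[tgt]

def meta_train(model, sample_task, steps=10_000, seed=0):
    """Meta-training loop: each step draws a fresh synthesized task,
    splits it, and takes one optimization step on the conditional loss."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        x, y = sample_task(rng)                    # one synthesized task
        cx, cy, tx, ty = context_target_split(x, y, rng)
        model.step(cx, cy, tx, ty)                 # e.g. gradient step on NLL
```

The point of the split is that the model never sees a fixed training set for one function; it is optimized to predict held-out points given arbitrary partial observations, which mirrors the deployment setting in NAS.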
If this is right
- Higher top-K ranking quality on NAS-Bench-101 and NAS-Bench-201 when only limited architecture evaluations are available.
- State-of-the-art final architecture selection accuracy under the same constrained evaluation budgets.
- Training procedure explicitly aligns with the partial-observation setting that occurs during NAS deployment.
- Meta-features for cell-based architectures enable the conditional inference approach without hand-crafted architecture encodings.
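The meta-feature bullet can be made concrete with a toy construction for a cell encoded as an adjacency matrix plus per-node operations. This is a guess at the flavor of such features (operation histogram, edge count, depth), not the paper's actual definition.

```python
import numpy as np

# Hypothetical operation vocabulary for a cell-based search space.
OPS = ["conv3x3", "conv1x1", "maxpool", "skip"]

def cell_meta_features(adj, ops):
    """Toy meta-features for a cell: operation histogram, edge count,
    and longest-path length (depth) in the DAG.
    Assumes nodes are topologically ordered."""
    hist = [ops.count(o) / len(ops) for o in OPS]
    n_edges = int(adj.sum())
    n = len(ops)
    depth = np.zeros(n, dtype=int)
    for j in range(n):
        preds = [i for i in range(j) if adj[i, j]]
        if preds:
            depth[j] = 1 + max(depth[i] for i in preds)
    return np.array(hist + [n_edges, int(depth.max())], dtype=float)
```

Features of this kind are cheap to compute and search-space-agnostic, which is presumably what lets them replace hand-crafted architecture encodings.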
Where Pith is reading between the lines
- The same meta-inference formulation could be applied to other black-box optimization domains where only sparse evaluations of candidate solutions are feasible.
- If the synthesized-task distribution is broadened, the predictor might support transfer across different cell-based or macro search spaces without retraining from scratch.
- The emphasis on generalization under scarcity suggests the method could serve as a drop-in replacement for regression heads in other sample-efficient machine-learning pipelines.
Load-bearing premise
The distribution of performance functions encountered during meta-training on synthesized tasks matches the distribution of real, unseen architectures that appear when the predictor is used inside an actual NAS search.
What would settle it
Run the meta-learned predictor and a regression baseline on a fresh NAS benchmark whose architectures and performance statistics were never seen during training or synthesis, then check whether the meta-learned model still produces higher top-K ranking accuracy at the same small sample size.
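The settling experiment reduces to comparing top-K ranking quality at a fixed sample budget. A minimal sketch of one common metric, precision@K (overlap between a predictor's top-K picks and the true top-K), under the assumption that both predictors emit scalar scores for the same candidate pool:

```python
import numpy as np

def precision_at_k(pred_scores, true_scores, k=10):
    """Fraction of the predictor's top-k candidates that are
    also in the ground-truth top-k."""
    pred_top = set(np.argsort(pred_scores)[-k:])
    true_top = set(np.argsort(true_scores)[-k:])
    return len(pred_top & true_top) / k

# Hypothetical comparison on a held-out benchmark slice
# (predictor objects and held-out arrays are placeholders):
# p_meta = meta_predictor.predict(held_out_encodings)
# p_reg = regression_baseline.predict(held_out_encodings)
# print(precision_at_k(p_meta, true_accs), precision_at_k(p_reg, true_accs))
```

A rank-correlation statistic such as Kendall's tau over the full candidate pool would complement this, since precision@K only inspects the head of the ranking.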
Original abstract
Prediction-based approaches are widely used in neural architecture search (NAS), where a predictor estimates the performance of candidate architectures to guide selection. However, existing predictors are typically trained via supervised regression on limited samples, leading to overfitting and poor generalization to unseen architectures. In this work, we propose a fundamentally different formulation that models performance prediction as a conditional function inference problem using a Convolutional Neural Process (ConvNP) with meta-learning capabilities. Instead of fitting a fixed mapping to limited samples, our approach meta-learns to infer performance from partial observations by training with context-target splits across a group of synthesized tasks, explicitly optimizing for generalization under data scarcity and aligning the training procedure with the deployment setting in NAS. We further design simple yet effective meta-features for cell-based architectures and evaluate our method on NAS-Bench-101 and NAS-Bench-201. Extensive experiments show that our approach consistently improves top-K ranking quality and achieves the state-of-the-art architecture selection using limited samples.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that reframing NAS performance prediction as conditional function inference via a meta-learned Convolutional Neural Process (ConvNP), trained on context-target splits across synthesized tasks rather than standard supervised regression, yields consistent gains in top-K ranking quality and achieves state-of-the-art architecture selection on NAS-Bench-101 and NAS-Bench-201 under limited samples. It introduces simple meta-features for cell-based architectures and argues that the meta-training procedure aligns with NAS deployment.
Significance. If the generalization from synthesized tasks holds and the empirical gains are robust, the work offers a principled shift from overfitting-prone regression predictors to inference-oriented meta-learners in NAS. This could improve sample efficiency in architecture search and encourage similar meta-learning formulations in other low-data prediction settings within the field.
Major comments (2)
- [Abstract and §4] Abstract and §4 (experimental evaluation): the central empirical claim of consistent top-K improvements and SOTA selection is reported without details on the exact baselines compared, any statistical significance tests, ablation studies isolating the contribution of the ConvNP meta-learning versus the meta-features, or quantitative characterization of how the synthesized tasks were constructed. These omissions make it impossible to verify whether the reported gains arise from the proposed inference formulation or from unstated differences in experimental protocol.
- [§3] §3 (task synthesis and meta-training procedure): the generalization assumption—that a ConvNP meta-learned on context-target splits from synthesized tasks will produce ranking behavior matching the true performance distribution of unseen cell-based architectures in NAS-Bench-101/201—is load-bearing for the entire contribution, yet no ablation or distributional comparison (e.g., correlation statistics between architecture encodings and accuracies in synthetic vs. real tasks) is provided. Without this, the performance gains could be artifacts of the synthetic distribution rather than a genuine advance in inference under data scarcity.
Minor comments (2)
- [Abstract] The abstract and introduction use the term 'meta-features' without an explicit definition or example in the main text; a short table or figure illustrating the feature construction for a sample cell would improve clarity.
- [§3] Notation for context-target splits and the ConvNP conditioning is introduced without a dedicated equation or diagram in the methods; adding a small schematic would aid readers unfamiliar with Neural Processes.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our work. We address each major comment in detail below and indicate the revisions made to the manuscript.
Point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (experimental evaluation): the central empirical claim of consistent top-K improvements and SOTA selection is reported without details on the exact baselines compared, any statistical significance tests, ablation studies isolating the contribution of the ConvNP meta-learning versus the meta-features, or quantitative characterization of how the synthesized tasks were constructed. These omissions make it impossible to verify whether the reported gains arise from the proposed inference formulation or from unstated differences in experimental protocol.
Authors: We fully agree that these details are essential for reproducibility and verification. In the revised manuscript, we have updated the abstract and §4 to provide: a complete enumeration of the baselines with citations and key hyperparameters; results from statistical significance testing (using 5-fold cross-validation and paired t-tests with p-values reported); ablation experiments that compare the full model against (i) a standard supervised regressor with meta-features, (ii) a ConvNP without meta-learning, and (iii) meta-learning without the convolutional structure; and a precise description of the synthesized task generation process, specifying the architecture sampling distribution, the number of tasks, and the performance proxy model used. These additions directly address the concern about unstated protocol differences. revision: yes
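The significance protocol the authors promise (paired t-tests over cross-validation folds) can be sketched with a NumPy-only t statistic; the fold scores below are placeholders, not the paper's numbers.

```python
import numpy as np

def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired t-test over per-fold metric scores.
    Compare |t| against the two-sided critical value for n-1 degrees
    of freedom (e.g. 2.776 for n=5 folds at alpha=0.05)."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

Pairing by fold matters: it cancels fold-to-fold variance that would otherwise swamp a small but consistent gap between the two predictors.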
-
Referee: [§3] §3 (task synthesis and meta-training procedure): the generalization assumption—that a ConvNP meta-learned on context-target splits from synthesized tasks will produce ranking behavior matching the true performance distribution of unseen cell-based architectures in NAS-Bench-101/201—is load-bearing for the entire contribution, yet no ablation or distributional comparison (e.g., correlation statistics between architecture encodings and accuracies in synthetic vs. real tasks) is provided. Without this, the performance gains could be artifacts of the synthetic distribution rather than a genuine advance in inference under data scarcity.
Authors: The referee correctly identifies the importance of validating the generalization from synthetic to real tasks. The current version demonstrates this through strong empirical results on the target benchmarks, but we concur that explicit comparisons would strengthen the argument. Accordingly, we have revised §3 to include distributional analyses, such as the correlation between meta-feature encodings and performance values in synthetic versus real data (reporting Pearson and Spearman coefficients), as well as an additional experiment meta-training on one benchmark and testing on the other. This helps confirm that the inference formulation, rather than synthetic artifacts, drives the improvements. revision: yes
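The distributional check the authors describe (rank correlation between synthetic and real performance statistics) amounts to computing Spearman's rho between two score vectors; a NumPy-only sketch, assuming no tied scores:

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.
    Assumes no ties; with ties, average ranks should be used instead."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])
```

Reporting both Pearson (on raw values) and Spearman (on ranks) is the natural choice here, since the NAS use case cares about ranking rather than calibrated accuracy estimates.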
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper reformulates NAS performance prediction as a meta-learning inference task using an existing Convolutional Neural Process (ConvNP) architecture trained on context-target splits from synthesized tasks. No equations or claims reduce the reported gains in top-K ranking or SOTA selection to a fitted parameter defined inside the paper, a self-citation chain, or a renaming of known results. The central procedure is a direct application of prior meta-learning methods to the NAS domain, with empirical validation on NAS-Bench-101 and NAS-Bench-201 providing independent content. The generalization assumption from synthetic to real distributions is a correctness risk rather than a circularity issue.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: a Convolutional Neural Process can effectively meta-learn conditional inference of performance functions from partial observations.
Reference graph
Works this paper leans on
- [1] Yash Akhauri and Mohamed S Abdelfattah. Encodings for prediction-based neural architecture search. In Proceedings of the 41st International Conference on Machine Learning, pages 740–759. PMLR, 2024.
- [2] Bowen Baker, Otkrist Gupta, Ramesh Raskar, and Nikhil Naik. Accelerating neural architecture search using performance prediction. arXiv preprint arXiv:1705.10823, 2017.
- [3] Andrew Brock, Theo Lim, J.M. Ritchie, and Nick Weston. SMASH: One-shot model architecture search through hypernetworks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rydeCEhs-
- [4] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
- [5] Boyang Deng, Junjie Yan, and Dahua Lin. Peephole: Predicting network performance before training. arXiv preprint arXiv:1712.03351, 2017.
- [6] Xuanyi Dong and Yi Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=HJxyZkBKDr
- [7] Yann Dubois, Jonathan Gordon, and Andrew YK Foong. Neural process family. http://yanndubs.github.io/Neural-Process-Family/, September 2020.
- [8] Lukasz Dudziak, Thomas Chau, Mohamed Abdelfattah, Royson Lee, Hyeji Kim, and Nicholas Lane. BRP-NAS: Prediction-based NAS using GCNs. Advances in Neural Information Processing Systems, 33:10480–10490, 2020.
- [9] Thomas Elsken, Benedikt Staffler, Jan Hendrik Metzen, and Frank Hutter. Meta-learning of neural architectures for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12365–12375, 2020.
- [10] Andrew Foong, Wessel Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, and Richard Turner. Meta-learning stationary stochastic process prediction with convolutional neural processes. Advances in Neural Information Processing Systems, 33:8284–8295, 2020.
- [11] Marta Garnelo, Dan Rosenbaum, Christopher Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo Rezende, and SM Ali Eslami. Conditional neural processes. In International Conference on Machine Learning, pages 1704–1713. PMLR, 2018.
- [12] David Ha, Andrew M. Dai, and Quoc V. Le. Hypernetworks. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=rkpACe1lx
- [13] Roxana Istrate, Florian Scheidegger, Giovanni Mariani, Dimitrios Nikolopoulos, Constantine Bekas, and Adelmo Cristiano Innocenza Malossi. TAPAS: Train-less accuracy predictor for architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 3927–3934, 2019.
- [14] Han Ji, Yuqi Feng, and Yanan Sun. CAP: A context-aware neural predictor for NAS. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI '24, 2024. ISBN 978-1-956792-04-1. doi: 10.24963/ijcai.2024/466. URL https://doi.org/10.24963/ijcai.2024/466
- [15] Han Ji, Yuqi Feng, Jiahao Fan, and Yanan Sun. CARL: Causality-guided architecture representation learning for an interpretable performance predictor. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 23019–23029, 2025.
- [16] Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, and Yee Whye Teh. Attentive neural processes. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=SkE6PjC9KX
- [17] Yuhong Li, Cong Hao, Pan Li, Jinjun Xiong, and Deming Chen. Generic neural architecture search via regression. Advances in Neural Information Processing Systems, 34:20476–20490, 2021.
- [18] Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. Progressive neural architecture search. In Proceedings of the European Conference on Computer Vision (ECCV), pages 19–34, 2018.
- [19] Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=S1eYHoC5FX
- [20] Yuqiao Liu, Yehui Tang, and Yanan Sun. Homogeneous architecture augmentation for neural predictor. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12249–12258, 2021.
- [21] Yuqiao Liu, Yehui Tang, Zeqiong Lv, Yunhe Wang, and Yanan Sun. Bridge the gap between architecture spaces via a cross-domain predictor. Advances in Neural Information Processing Systems, 35:13355–13366, 2022.
- [22] Jovita Lukasik, David Friede, Heiner Stuckenschmidt, and Margret Keuper. Neural architecture performance prediction using graph neural networks. In DAGM German Conference on Pattern Recognition, pages 188–201. Springer, 2020.
- [23] Lianbo Ma, Haidong Kang, Guo Yu, Qing Li, and Qiang He. Single-domain generalized predictor for neural architecture search system. IEEE Transactions on Computers, 73(5):1400–1413, 2024.
- [24] Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, and Huazhong Yang. A generic graph-based neural architecture encoding scheme for predictor-based NAS. In European Conference on Computer Vision, pages 189–204. Springer, 2020.
- [25] Xuefei Ning, Zixuan Zhou, Junbo Zhao, Tianchen Zhao, Yiping Deng, Changcheng Tang, Shuang Liang, Huazhong Yang, and Yu Wang. TA-GATES: An encoding scheme for neural network architectures. Advances in Neural Information Processing Systems, 35:32325–32339, 2022.
- [26] Gean T Pereira, Iury BA Santos, Luis PF Garcia, Thierry Urruty, Muriel Visani, and André CPLF De Carvalho. Neural architecture search with interpretable meta-features and fast predictors. Information Sciences, 649:119642, 2023.
- [27] Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019.
- [28] Hinrich Schütze, Christopher D Manning, and Prabhakar Raghavan. Introduction to Information Retrieval, volume 39. Cambridge University Press, Cambridge, 2008.
- [29] Albert Shaw, Wei Wei, Weiyang Liu, Le Song, and Bo Dai. Meta architecture search. Advances in Neural Information Processing Systems, 32, 2019.
- [30] Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James Kwok, and Tong Zhang. Bridging the gap between sample-based and one-shot neural architecture search with BONAS. Advances in Neural Information Processing Systems, 33:1808–1819, 2020.
- [31] Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Boxin Shi, Chao Xu, Chunjing Xu, Qi Tian, and Chang Xu. A semi-supervised assessor of neural architectures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1810–1819, 2020.
- [32] Haoxiang Wang, Yite Wang, Ruoyu Sun, and Bo Li. Global convergence of MAML and theory-inspired neural architecture search for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9797–9808, 2022.
- [33] Wei Wen, Hanxiao Liu, Yiran Chen, Hai Li, Gabriel Bender, and Pieter-Jan Kindermans. Neural predictor for neural architecture search. In European Conference on Computer Vision, pages 660–676. Springer, 2020.
- [34] Colin White, Willie Neiswanger, Sam Nolen, and Yash Savani. A study on encodings for neural architecture search. Advances in Neural Information Processing Systems, 33:20309–20319, 2020.
- [35]
- [36] Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, and Frank Hutter. NAS-Bench-101: Towards reproducible neural architecture search. In International Conference on Machine Learning, pages 7105–7114. PMLR, 2019.
- [37] Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, and Tian Guo. Few-shot neural architecture search. In International Conference on Machine Learning, pages 12707–12718. PMLR, 2021.
- [38] Barret Zoph and Quoc Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=r1Ue8Hcxg