Recognition: no theorem link
From Regression to Inference: Meta-Learning Predictors for Neural Architecture Search
Pith reviewed 2026-05-12 03:54 UTC · model grok-4.3
The pith
Reframing NAS performance prediction as meta-learned conditional inference yields better generalization from few samples than standard regression.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By training a Convolutional Neural Process on context-target splits drawn from groups of synthesized tasks, the model learns to infer architecture performance conditionally rather than fitting a fixed regression mapping; this produces predictors whose ranking behavior on unseen architectures is measurably stronger under the same sample budgets used in prior work.
What carries the argument
Convolutional Neural Process meta-trained via context-target splits on synthesized tasks, which performs conditional function inference from partial performance observations.
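The carrying mechanism can be sketched as a meta-training loop: each synthesized task supplies (architecture encoding, performance) pairs, which are randomly split into a context set the model conditions on and a target set it must predict. A minimal sketch, assuming a hypothetical model object with a `step(context_x, context_y, target_x, target_y)` method; the task sampler and encodings are placeholders, not the paper's actual construction.

```python
import numpy as np

def context_target_split(x, y, rng, min_context=3):
    """Randomly partition one task's (encoding, performance) pairs
    into a context set and a target set."""
    n = len(x)
    idx = rng.permutation(n)
    n_ctx = int(rng.integers(min_context, n - 1))  # at least one target point
    ctx, tgt = idx[:n_ctx], idx[n_ctx:]
    return x[ctx], y[ctx], x[tgt], y[tgt]

def meta_train(model, sample_task, steps=10_000, seed=0):
    """Meta-training loop: each step draws a fresh synthesized task,
    splits it, and takes one optimization step on the conditional loss."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        x, y = sample_task(rng)                    # one synthesized task
        cx, cy, tx, ty = context_target_split(x, y, rng)
        model.step(cx, cy, tx, ty)                 # e.g. gradient step on NLL
```

The point of the split is that the model never sees a fixed training set for one function; it is optimized to predict held-out points given arbitrary partial observations, which mirrors the deployment setting in NAS.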
If this is right
- Higher top-K ranking quality on NAS-Bench-101 and NAS-Bench-201 when only limited architecture evaluations are available.
- State-of-the-art final architecture selection accuracy under the same constrained evaluation budgets.
- Training procedure explicitly aligns with the partial-observation setting that occurs during NAS deployment.
- Meta-features for cell-based architectures enable the conditional inference approach without hand-crafted architecture encodings.
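The meta-feature bullet can be made concrete with a toy construction for a cell encoded as an adjacency matrix plus per-node operations. This is a guess at the flavor of such features (operation histogram, edge count, depth), not the paper's actual definition.

```python
import numpy as np

# Hypothetical operation vocabulary for a cell-based search space.
OPS = ["conv3x3", "conv1x1", "maxpool", "skip"]

def cell_meta_features(adj, ops):
    """Toy meta-features for a cell: operation histogram, edge count,
    and longest-path length (depth) in the DAG.
    Assumes nodes are topologically ordered."""
    hist = [ops.count(o) / len(ops) for o in OPS]
    n_edges = int(adj.sum())
    n = len(ops)
    depth = np.zeros(n, dtype=int)
    for j in range(n):
        preds = [i for i in range(j) if adj[i, j]]
        if preds:
            depth[j] = 1 + max(depth[i] for i in preds)
    return np.array(hist + [n_edges, int(depth.max())], dtype=float)
```

Features of this kind are cheap to compute and search-space-agnostic, which is presumably what lets them replace hand-crafted architecture encodings.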
Where Pith is reading between the lines
- The same meta-inference formulation could be applied to other black-box optimization domains where only sparse evaluations of candidate solutions are feasible.
- If the synthesized-task distribution is broadened, the predictor might support transfer across different cell-based or macro search spaces without retraining from scratch.
- The emphasis on generalization under scarcity suggests the method could serve as a drop-in replacement for regression heads in other sample-efficient machine-learning pipelines.
Load-bearing premise
The distribution of performance functions encountered during meta-training on synthesized tasks matches the distribution of real, unseen architectures that appear when the predictor is used inside an actual NAS search.
What would settle it
Run the meta-learned predictor and a regression baseline on a fresh NAS benchmark whose architectures and performance statistics were never seen during training or synthesis, then check whether the meta-learned model still produces higher top-K ranking accuracy at the same small sample size.
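The settling experiment reduces to comparing top-K ranking quality at a fixed sample budget. A minimal sketch of one common metric, precision@K (overlap between a predictor's top-K picks and the true top-K), under the assumption that both predictors emit scalar scores for the same candidate pool:

```python
import numpy as np

def precision_at_k(pred_scores, true_scores, k=10):
    """Fraction of the predictor's top-k candidates that are
    also in the ground-truth top-k."""
    pred_top = set(np.argsort(pred_scores)[-k:])
    true_top = set(np.argsort(true_scores)[-k:])
    return len(pred_top & true_top) / k

# Hypothetical comparison on a held-out benchmark slice
# (predictor objects and held-out arrays are placeholders):
# p_meta = meta_predictor.predict(held_out_encodings)
# p_reg = regression_baseline.predict(held_out_encodings)
# print(precision_at_k(p_meta, true_accs), precision_at_k(p_reg, true_accs))
```

A rank-correlation statistic such as Kendall's tau over the full candidate pool would complement this, since precision@K only inspects the head of the ranking.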
Original abstract
Prediction-based approaches are widely used in neural architecture search (NAS), where a predictor estimates the performance of candidate architectures to guide selection. However, existing predictors are typically trained via supervised regression on limited samples, leading to overfitting and poor generalization to unseen architectures. In this work, we propose a fundamentally different formulation that models performance prediction as a conditional function inference problem using a Convolutional Neural Process (ConvNP) with meta-learning capabilities. Instead of fitting a fixed mapping to limited samples, our approach meta-learns to infer performance from partial observations by training with context-target splits across a group of synthesized tasks, explicitly optimizing for generalization under data scarcity and aligning the training procedure with the deployment setting in NAS. We further design simple yet effective meta-features for cell-based architectures and evaluate our method on NAS-Bench-101 and NAS-Bench-201. Extensive experiments show that our approach consistently improves top-K ranking quality and achieves the state-of-the-art architecture selection using limited samples.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that reframing NAS performance prediction as conditional function inference via a meta-learned Convolutional Neural Process (ConvNP), trained on context-target splits across synthesized tasks rather than standard supervised regression, yields consistent gains in top-K ranking quality and achieves state-of-the-art architecture selection on NAS-Bench-101 and NAS-Bench-201 under limited samples. It introduces simple meta-features for cell-based architectures and argues that the meta-training procedure aligns with NAS deployment.
Significance. If the generalization from synthesized tasks holds and the empirical gains are robust, the work offers a principled shift from overfitting-prone regression predictors to inference-oriented meta-learners in NAS. This could improve sample efficiency in architecture search and encourage similar meta-learning formulations in other low-data prediction settings within the field.
Major comments (2)
- [Abstract and §4] Abstract and §4 (experimental evaluation): the central empirical claim of consistent top-K improvements and SOTA selection is reported without details on the exact baselines compared, any statistical significance tests, ablation studies isolating the contribution of the ConvNP meta-learning versus the meta-features, or quantitative characterization of how the synthesized tasks were constructed. These omissions make it impossible to verify whether the reported gains arise from the proposed inference formulation or from unstated differences in experimental protocol.
- [§3] §3 (task synthesis and meta-training procedure): the generalization assumption—that a ConvNP meta-learned on context-target splits from synthesized tasks will produce ranking behavior matching the true performance distribution of unseen cell-based architectures in NAS-Bench-101/201—is load-bearing for the entire contribution, yet no ablation or distributional comparison (e.g., correlation statistics between architecture encodings and accuracies in synthetic vs. real tasks) is provided. Without this, the performance gains could be artifacts of the synthetic distribution rather than a genuine advance in inference under data scarcity.
Minor comments (2)
- [Abstract] The abstract and introduction use the term 'meta-features' without an explicit definition or example in the main text; a short table or figure illustrating the feature construction for a sample cell would improve clarity.
- [§3] Notation for context-target splits and the ConvNP conditioning is introduced without a dedicated equation or diagram in the methods; adding a small schematic would aid readers unfamiliar with Neural Processes.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity and rigor of our work. We address each major comment in detail below and indicate the revisions made to the manuscript.
Point-by-point responses
-
Referee: [Abstract and §4] Abstract and §4 (experimental evaluation): the central empirical claim of consistent top-K improvements and SOTA selection is reported without details on the exact baselines compared, any statistical significance tests, ablation studies isolating the contribution of the ConvNP meta-learning versus the meta-features, or quantitative characterization of how the synthesized tasks were constructed. These omissions make it impossible to verify whether the reported gains arise from the proposed inference formulation or from unstated differences in experimental protocol.
Authors: We fully agree that these details are essential for reproducibility and verification. In the revised manuscript, we have updated the abstract and §4 to provide: a complete enumeration of the baselines with citations and key hyperparameters; results from statistical significance testing (using 5-fold cross-validation and paired t-tests with p-values reported); ablation experiments that compare the full model against (i) a standard supervised regressor with meta-features, (ii) a ConvNP without meta-learning, and (iii) meta-learning without the convolutional structure; and a precise description of the synthesized task generation process, specifying the architecture sampling distribution, the number of tasks, and the performance proxy model used. These additions directly address the concern about unstated protocol differences. revision: yes
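The significance protocol the authors promise (paired t-tests over cross-validation folds) can be sketched with a NumPy-only t statistic; the fold scores below are placeholders, not the paper's numbers.

```python
import numpy as np

def paired_t_statistic(scores_a, scores_b):
    """t statistic of a paired t-test over per-fold metric scores.
    Compare |t| against the two-sided critical value for n-1 degrees
    of freedom (e.g. 2.776 for n=5 folds at alpha=0.05)."""
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```

Pairing by fold matters: it cancels fold-to-fold variance that would otherwise swamp a small but consistent gap between the two predictors.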
-
Referee: [§3] §3 (task synthesis and meta-training procedure): the generalization assumption—that a ConvNP meta-learned on context-target splits from synthesized tasks will produce ranking behavior matching the true performance distribution of unseen cell-based architectures in NAS-Bench-101/201—is load-bearing for the entire contribution, yet no ablation or distributional comparison (e.g., correlation statistics between architecture encodings and accuracies in synthetic vs. real tasks) is provided. Without this, the performance gains could be artifacts of the synthetic distribution rather than a genuine advance in inference under data scarcity.
Authors: The referee correctly identifies the importance of validating the generalization from synthetic to real tasks. The current version demonstrates this through strong empirical results on the target benchmarks, but we concur that explicit comparisons would strengthen the argument. Accordingly, we have revised §3 to include distributional analyses, such as the correlation between meta-feature encodings and performance values in synthetic versus real data (reporting Pearson and Spearman coefficients), as well as an additional experiment meta-training on one benchmark and testing on the other. This helps confirm that the inference formulation, rather than synthetic artifacts, drives the improvements. revision: yes
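The distributional check the authors describe (rank correlation between synthetic and real performance statistics) amounts to computing Spearman's rho between two score vectors; a NumPy-only sketch, assuming no tied scores:

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.
    Assumes no ties; with ties, average ranks should be used instead."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])
```

Reporting both Pearson (on raw values) and Spearman (on ranks) is the natural choice here, since the NAS use case cares about ranking rather than calibrated accuracy estimates.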
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper reformulates NAS performance prediction as a meta-learning inference task using an existing Convolutional Neural Process (ConvNP) architecture trained on context-target splits from synthesized tasks. No equations or claims reduce the reported gains in top-K ranking or SOTA selection to a fitted parameter defined inside the paper, a self-citation chain, or a renaming of known results. The central procedure is a direct application of prior meta-learning methods to the NAS domain, with empirical validation on NAS-Bench-101 and NAS-Bench-201 providing independent content. The generalization assumption from synthetic to real distributions is a correctness risk rather than a circularity issue.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: a Convolutional Neural Process can effectively meta-learn conditional inference of performance functions from partial observations.
Reference graph
Works this paper leans on
- [1] Yash Akhauri and Mohamed S Abdelfattah. Encodings for prediction-based neural architecture search. In Proceedings of the 41st International Conference on Machine Learning, pages 740–759. PMLR, 2024.
- [2] Bowen Baker, Otkrist Gupta, Ramesh Raskar, and Nikhil Naik. Accelerating neural architecture search using performance prediction. arXiv preprint arXiv:1705.10823, 2017.
- [3] Andrew Brock, Theo Lim, J.M. Ritchie, and Nick Weston. SMASH: One-shot model architecture search through hypernetworks. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rydeCEhs-
- [4] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 785–794, 2016.
- [5] Boyang Deng, Junjie Yan, and Dahua Lin. Peephole: Predicting network performance before training. arXiv preprint arXiv:1712.03351, 2017.
- [6] Xuanyi Dong and Yi Yang. NAS-Bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=HJxyZkBKDr
- [7] Yann Dubois, Jonathan Gordon, and Andrew YK Foong. Neural process family. http://yanndubs.github.io/Neural-Process-Family/, September 2020.
- [8] Lukasz Dudziak, Thomas Chau, Mohamed Abdelfattah, Royson Lee, Hyeji Kim, and Nicholas Lane. BRP-NAS: Prediction-based NAS using GCNs. Advances in Neural Information Processing Systems, 33:10480–10490, 2020.
- [9] Thomas Elsken, Benedikt Staffler, Jan Hendrik Metzen, and Frank Hutter. Meta-learning of neural architectures for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12365–12375, 2020.
- [10] Andrew Foong, Wessel Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, and Richard Turner. Meta-learning stationary stochastic process prediction with convolutional neural processes. Advances in Neural Information Processing Systems, 33:8284–8295, 2020.
- [11] Marta Garnelo, Dan Rosenbaum, Christopher Maddison, Tiago Ramalho, David Saxton, Murray Shanahan, Yee Whye Teh, Danilo Rezende, and SM Ali Eslami. Conditional neural processes. In International Conference on Machine Learning, pages 1704–1713. PMLR, 2018.
- [12] David Ha, Andrew M. Dai, and Quoc V. Le. Hypernetworks. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=rkpACe1lx
- [13] Roxana Istrate, Florian Scheidegger, Giovanni Mariani, Dimitrios Nikolopoulos, Constantine Bekas, and Adelmo Cristiano Innocenza Malossi. TAPAS: Train-less accuracy predictor for architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 3927–3934, 2019.
- [14] Han Ji, Yuqi Feng, and Yanan Sun. CAP: A context-aware neural predictor for NAS. In Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI '24, 2024. ISBN 978-1-956792-04-1. doi: 10.24963/ijcai.2024/466. URL https://doi.org/10.24963/ijcai.2024/466
- [15] Han Ji, Yuqi Feng, Jiahao Fan, and Yanan Sun. CARL: Causality-guided architecture representation learning for an interpretable performance predictor. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 23019–23029, 2025.
- [16] Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, and Yee Whye Teh. Attentive neural processes. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=SkE6PjC9KX
- [17] Yuhong Li, Cong Hao, Pan Li, Jinjun Xiong, and Deming Chen. Generic neural architecture search via regression. Advances in Neural Information Processing Systems, 34:20476–20490, 2021.
- [18] Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, and Kevin Murphy. Progressive neural architecture search. In Proceedings of the European Conference on Computer Vision (ECCV), pages 19–34, 2018.
- [19] Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=S1eYHoC5FX
- [20] Yuqiao Liu, Yehui Tang, and Yanan Sun. Homogeneous architecture augmentation for neural predictor. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12249–12258, 2021.
- [21] Yuqiao Liu, Yehui Tang, Zeqiong Lv, Yunhe Wang, and Yanan Sun. Bridge the gap between architecture spaces via a cross-domain predictor. Advances in Neural Information Processing Systems, 35:13355–13366, 2022.
- [22] Jovita Lukasik, David Friede, Heiner Stuckenschmidt, and Margret Keuper. Neural architecture performance prediction using graph neural networks. In DAGM German Conference on Pattern Recognition, pages 188–201. Springer, 2020.
- [23] Lianbo Ma, Haidong Kang, Guo Yu, Qing Li, and Qiang He. Single-domain generalized predictor for neural architecture search system. IEEE Transactions on Computers, 73(5):1400–1413, 2024.
- [24] Xuefei Ning, Yin Zheng, Tianchen Zhao, Yu Wang, and Huazhong Yang. A generic graph-based neural architecture encoding scheme for predictor-based NAS. In European Conference on Computer Vision, pages 189–204. Springer, 2020.
- [25] Xuefei Ning, Zixuan Zhou, Junbo Zhao, Tianchen Zhao, Yiping Deng, Changcheng Tang, Shuang Liang, Huazhong Yang, and Yu Wang. TA-GATES: An encoding scheme for neural network architectures. Advances in Neural Information Processing Systems, 35:32325–32339, 2022.
- [26] Gean T Pereira, Iury BA Santos, Luis PF Garcia, Thierry Urruty, Muriel Visani, and André CPLF De Carvalho. Neural architecture search with interpretable meta-features and fast predictors. Information Sciences, 649:119642, 2023.
- [27] Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019.
- [28] Hinrich Schütze, Christopher D Manning, and Prabhakar Raghavan. Introduction to Information Retrieval, volume 39. Cambridge University Press, Cambridge, 2008.
- [29] Albert Shaw, Wei Wei, Weiyang Liu, Le Song, and Bo Dai. Meta architecture search. Advances in Neural Information Processing Systems, 32, 2019.
- [30] Han Shi, Renjie Pi, Hang Xu, Zhenguo Li, James Kwok, and Tong Zhang. Bridging the gap between sample-based and one-shot neural architecture search with BONAS. Advances in Neural Information Processing Systems, 33:1808–1819, 2020.
- [31] Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Boxin Shi, Chao Xu, Chunjing Xu, Qi Tian, and Chang Xu. A semi-supervised assessor of neural architectures. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1810–1819, 2020.
- [32] Haoxiang Wang, Yite Wang, Ruoyu Sun, and Bo Li. Global convergence of MAML and theory-inspired neural architecture search for few-shot learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9797–9808, 2022.
- [33] Wei Wen, Hanxiao Liu, Yiran Chen, Hai Li, Gabriel Bender, and Pieter-Jan Kindermans. Neural predictor for neural architecture search. In European Conference on Computer Vision, pages 660–676. Springer, 2020.
- [34] Colin White, Willie Neiswanger, Sam Nolen, and Yash Savani. A study on encodings for neural architecture search. Advances in Neural Information Processing Systems, 33:20309–20319, 2020.
- [35]
- [36] Chris Ying, Aaron Klein, Eric Christiansen, Esteban Real, Kevin Murphy, and Frank Hutter. NAS-Bench-101: Towards reproducible neural architecture search. In International Conference on Machine Learning, pages 7105–7114. PMLR, 2019.
- [37] Yiyang Zhao, Linnan Wang, Yuandong Tian, Rodrigo Fonseca, and Tian Guo. Few-shot neural architecture search. In International Conference on Machine Learning, pages 12707–12718. PMLR, 2021.
- [38] Barret Zoph and Quoc Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum?id=r1Ue8Hcxg