CoLLM-NAS: Collaborative Large Language Models for Efficient Knowledge-Guided Neural Architecture Search

Yongtao Wang; Zhe Li; Zhiwei Lin

arXiv: 2509.26037 · v2 · pith:WKPDHE7Rnew · submitted 2025-09-30 · 💻 cs.AI · cs.CV · cs.LG

Pith reviewed 2026-05-21 20:58 UTC · model grok-4.3

classification: 💻 cs.AI · cs.CV · cs.LG

keywords: neural architecture search · large language models · collaborative LLMs · knowledge-guided search · efficient NAS · two-stage NAS · ImageNet · NAS-Bench-201

The pith

A pair of large language models, one steering search direction and one generating candidates, delivers state-of-the-art neural architectures at 4 to 10 times lower search cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CoLLM-NAS as a two-stage framework that pairs a stateful Navigator LLM to set search direction with a stateless Generator LLM to create concrete architecture candidates. A Coordinator module handles their exchanges and incorporates evaluation feedback plus prior results into the process. This combination draws on the models' pre-trained understanding of neural network structures while adding progressive refinement from each iteration. A sympathetic reader would care because traditional neural architecture search often demands enormous computation or produces invalid designs; a reliable way to cut those costs could let more researchers explore tailored networks for specific hardware or tasks.

Core claim

CoLLM-NAS is a two-stage NAS framework that uses a stateful Navigator LLM to guide search direction, a stateless Generator LLM to synthesize high-quality candidates, and a Coordinator module to orchestrate inter-LLM communication and manage evaluation processes. The method guides the search by combining the LLMs' inherent knowledge of structured neural architectures with progressive knowledge from iterative feedback and the historical search trajectory. Experimental results on ImageNet and NAS-Bench-201 show that CoLLM-NAS surpasses existing NAS methods and conventional search algorithms, achieving new state-of-the-art results while reducing search costs by a factor of 4 to 10.

What carries the argument

The CoLLM-NAS two-stage framework consisting of a stateful Navigator LLM for directional guidance, a stateless Generator LLM for candidate synthesis, and a Coordinator that manages communication and feedback integration.
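
The division of labor described above can be illustrated with a toy loop. This is a minimal sketch, not the authors' implementation: the two "LLMs" are replaced by simple rule-based stand-ins, and the search space (`OPS`), the scoring function (`toy_evaluate`), and all class and function names are hypothetical. It only shows where state lives (in the Navigator) and what the Coordinator relays.

```python
import random
from dataclasses import dataclass, field

# Hypothetical toy search space: an architecture is a tuple of 4 operations.
OPS = ("conv3x3", "conv1x1", "skip", "pool")


def toy_evaluate(arch):
    """Stand-in for supernet or benchmark evaluation (not the paper's evaluator)."""
    weights = {"conv3x3": 3, "conv1x1": 2, "skip": 1, "pool": 0}
    return sum(weights[op] for op in arch)


@dataclass
class Navigator:
    """Stateful role: keeps the (architecture, score) history across iterations."""
    history: list = field(default_factory=list)

    def propose_direction(self):
        # A real Navigator would be an LLM prompted with the history.
        # This stand-in simply points at the best architecture seen so far.
        if not self.history:
            return None
        return max(self.history, key=lambda item: item[1])[0]

    def observe(self, results):
        self.history.extend(results)


class Generator:
    """Stateless role: every call depends only on its arguments."""

    def generate(self, direction, n, rng):
        if direction is None:
            return [tuple(rng.choice(OPS) for _ in range(4)) for _ in range(n)]
        candidates = []
        for _ in range(n):
            arch = list(direction)
            arch[rng.randrange(len(arch))] = rng.choice(OPS)  # change one position
            candidates.append(tuple(arch))
        return candidates


def coordinate(iterations=5, per_iter=4, seed=0):
    """Coordinator: relays direction, filters invalid candidates, feeds scores back."""
    rng = random.Random(seed)
    navigator, generator = Navigator(), Generator()
    for _ in range(iterations):
        direction = navigator.propose_direction()
        candidates = generator.generate(direction, per_iter, rng)
        valid = [a for a in candidates if len(a) == 4 and all(op in OPS for op in a)]
        navigator.observe([(a, toy_evaluate(a)) for a in valid])
    return max(navigator.history, key=lambda item: item[1])


if __name__ == "__main__":
    print(coordinate())
```

Keeping the Generator stateless means its prompt (here, its arguments) stays short and each call is reproducible given the same direction, while all accumulated search knowledge sits in one place.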

If this is right

Achieves new state-of-the-art results on ImageNet and NAS-Bench-201 while cutting search costs by a factor of 4 to 10.
Consistently improves both accuracy and search efficiency when applied to existing two-stage NAS methods such as OFA, SPOS, and AutoFormer.
Generalizes across multiple search spaces including those for MobileNet, ShuffleNet, and AutoFormer variants.
Enables knowledge-guided search that avoids many of the invalid architectures produced by prior LLM-NAS approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same division of labor between one model that maintains search state and another that proposes concrete solutions might transfer to automated design problems outside neural networks, such as optimizing compiler passes or molecular structures.
Groups with modest computing budgets could use the approach to generate competitive custom models without renting large GPU clusters for weeks.
Explicit separation of directional guidance from candidate generation may offer a template for other multi-agent LLM systems that must balance exploration with concrete output.

Load-bearing premise

The assumption that pairing a stateful Navigator LLM with a stateless Generator LLM and feeding back evaluation results through a Coordinator will consistently produce valid, high-performing architectures without the invalidity or inefficiency problems of earlier LLM-based searches.

What would settle it

Running the method on NAS-Bench-201 and finding that the top architectures discovered do not exceed the accuracy of the best previously reported entries or that total search time remains within a factor of two of standard evolutionary or reinforcement-learning baselines.
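
The test stated above reduces to two comparisons, which can be written down explicitly. This is only a sketch of the criterion as worded here: the function name, argument names, and the choice to treat "within a factor of two" as a speedup of at most 2 are assumptions, and the numbers in the usage lines are placeholders rather than results from the paper.

```python
def claim_refuted(best_found_acc, best_reported_acc,
                  method_search_time, baseline_search_time):
    """Return True if either failure condition of the falsification test holds."""
    # Condition 1: the discovered architecture does not beat the best reported one.
    no_accuracy_gain = best_found_acc <= best_reported_acc
    # Condition 2: search time stays within a factor of two of the baseline.
    speedup = baseline_search_time / method_search_time
    no_real_speedup = speedup <= 2.0
    return no_accuracy_gain or no_real_speedup


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(claim_refuted(94.5, 94.37, method_search_time=10, baseline_search_time=100))
    print(claim_refuted(94.5, 94.37, method_search_time=60, baseline_search_time=100))
```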

Figures

Figures reproduced from arXiv: 2509.26037 by Yongtao Wang, Zhe Li, Zhiwei Lin.

**Figure 2.** Pipeline of CoLLM-NAS. The search starts with the Navigator LLM generating an initial ...

**Figure 3.** T-SNE visualization of CoLLM-NAS's search dynamics on ImageNet-16-120 within NAS ...

**Figure 4.** Ablation on main mechanisms: (a) Comparison of iterative performance between CoLLM ...

**Figure 5.** Performance comparison of different temperature settings on CIFAR-100 within NAS ...

Original abstract

The integration of Large Language Models (LLMs) with Neural Architecture Search (NAS) has introduced new possibilities for automating the design of neural architectures. However, most existing methods face critical limitations, including architectural invalidity, computational inefficiency, and inferior performance compared to traditional NAS. In this work, we present Collaborative LLM-based NAS (CoLLM-NAS), a two-stage NAS framework with knowledge-guided search driven by two complementary LLMs. Specifically, we propose a stateful Navigator LLM to guide search direction, a stateless Generator LLM to synthesize high-quality candidates, and a Coordinator module to orchestrate inter-LLM communication and manage evaluation processes. CoLLM-NAS efficiently guides the search process by combining LLMs' inherent knowledge of structured neural architectures with progressive knowledge from iterative feedback and historical trajectory. Experimental results on ImageNet and NAS-Bench-201 show that CoLLM-NAS surpasses existing NAS methods and conventional search algorithms, achieving new state-of-the-art results while significantly reducing search costs by 4--10. Furthermore, CoLLM-NAS consistently enhances the performance and efficiency of various two-stage NAS methods (e.g., OFA, SPOS, and AutoFormer) across diverse search spaces (e.g., MobileNet, ShuffleNet, and AutoFormer), demonstrating its excellent generalization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

CoLLM-NAS splits LLM work into stateful Navigator and stateless Generator roles with a Coordinator, claiming SOTA results and 4-10x cost cuts, but the efficiency numbers look vulnerable to uncounted LLM overhead.

Colleague, the key point on this CoLLM-NAS paper is that it frames NAS as a collaboration between a stateful Navigator LLM that steers the search using history and a stateless Generator that produces candidates, managed by a Coordinator. They get SOTA results on ImageNet and NAS-Bench-201 while cutting search costs by 4-10 times. The new angle is the explicit split in LLM responsibilities plus the coordinator for handling feedback loops. That seems like a step beyond just prompting a single LLM for architectures. They also show it can improve other methods like OFA and SPOS in various spaces, which is a solid practical demonstration if the experiments back it up.

Where it gets soft is the cost reduction. The abstract highlights big savings, but doesn't clarify whether the tally includes all the LLM inference steps across iterations. If those add up significantly, the net gain over prior work could shrink. The lack of error bars or significance tests in the summary also leaves the performance edge open to question.

This paper targets folks in automated machine learning who are experimenting with LLMs for design tasks. Someone looking for ways to make NAS more guided and less brute force might find the role definitions useful. It deserves a serious referee because the setup is described clearly enough and uses common benchmarks. Revisions would probably focus on tightening the evaluation and cost details. I'd recommend putting it through peer review with attention to how they measured the search costs.
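
The letter's worry about uncounted LLM overhead can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not the paper's cost model: it treats search cost as evaluations times seconds per evaluation, adds LLM time as a flat per-call cost, and mixes GPU seconds with LLM wall-clock seconds for simplicity. The function name and every number in the usage lines are hypothetical.

```python
def net_speedup(baseline_evals, method_evals, gpu_seconds_per_eval,
                llm_calls=0, seconds_per_llm_call=0.0):
    """Ratio of baseline search cost to method cost, with LLM time charged to the method."""
    baseline_cost = baseline_evals * gpu_seconds_per_eval
    method_cost = (method_evals * gpu_seconds_per_eval
                   + llm_calls * seconds_per_llm_call)
    return baseline_cost / method_cost


if __name__ == "__main__":
    # Hypothetical numbers: 10x fewer evaluations, LLM overhead ignored.
    print(net_speedup(1000, 100, 10.0))  # 10.0
    # Same run, but charging 200 LLM calls at 5 s each halves the advantage.
    print(net_speedup(1000, 100, 10.0, llm_calls=200, seconds_per_llm_call=5.0))  # 5.0
```

When each evaluation is cheap, as with a tabular benchmark lookup, the LLM term can dominate the denominator, which is why the cost metric needs to be stated explicitly.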

Referee Report

2 major / 2 minor

Summary. The paper presents CoLLM-NAS, a two-stage collaborative LLM-based NAS framework consisting of a stateful Navigator LLM to guide search direction, a stateless Generator LLM to synthesize candidates, and a Coordinator module to orchestrate communication and evaluation. It claims to surpass existing NAS methods and conventional algorithms with new state-of-the-art results on ImageNet and NAS-Bench-201 while reducing search costs by a factor of 4-10, and to generalize by enhancing other two-stage NAS methods (OFA, SPOS, AutoFormer) across search spaces such as MobileNet, ShuffleNet, and AutoFormer.

Significance. If the performance and efficiency results hold under rigorous controls, the work would meaningfully advance LLM-guided NAS by addressing common issues of architectural invalidity and inefficiency through iterative knowledge feedback and inter-LLM coordination. The reported generalization across multiple search spaces and base methods is a positive aspect that could broaden practical applicability of automated architecture design.

major comments (2)

[Abstract] The central claim of 'significantly reducing search costs by 4--10' is load-bearing for the efficiency contribution. The description provides no information on the cost metric (e.g., whether cumulative LLM inference overhead from repeated Navigator-Generator-Coordinator cycles is included versus only architecture training/validation costs on ImageNet or NAS-Bench-201), which directly affects whether the 4-10x advantage over prior methods like SPOS or OFA can be substantiated.
[Experimental Results] The abstract asserts SOTA performance and superiority over existing NAS methods, yet supplies no details on baselines, number of runs, statistical significance, error bars, or controls for LLM output stochasticity. These omissions prevent evaluation of the reliability of the reported accuracy gains and generalization claims.

minor comments (2)

[Abstract] The phrasing 'reducing search costs by 4--10' is imprecise; it should read 'by a factor of 4 to 10' for clarity.
[Abstract] The roles of the Navigator, Generator, and Coordinator could be defined in one additional sentence to aid readers new to the collaborative setup.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback and detailed comments on our manuscript. We address each major comment point by point below, agreeing that greater clarity on cost metrics and experimental reporting will strengthen the presentation. We commit to incorporating these revisions in the next version of the paper.

Referee: [Abstract] The central claim of 'significantly reducing search costs by 4--10' is load-bearing for the efficiency contribution. The description provides no information on the cost metric (e.g., whether cumulative LLM inference overhead from repeated Navigator-Generator-Coordinator cycles is included versus only architecture training/validation costs on ImageNet or NAS-Bench-201), which directly affects whether the 4-10x advantage over prior methods like SPOS or OFA can be substantiated.

Authors: We appreciate this observation on the cost metric. In CoLLM-NAS, the 4-10x reduction refers to the number of architecture evaluations (i.e., training and validation costs on ImageNet or NAS-Bench-201) enabled by the knowledge-guided iterative search, which is the standard metric in the NAS literature for comparing search efficiency against methods like SPOS and OFA. LLM inference overhead is not the primary component and is typically orders of magnitude smaller than GPU training costs, but we acknowledge the abstract does not explicitly clarify this distinction. We will revise the abstract to specify the cost metric and add a dedicated paragraph in the Experimental Results section providing a breakdown of total compute, including relative LLM overhead.

Revision: yes

Referee: [Experimental Results] The abstract asserts SOTA performance and superiority over existing NAS methods, yet supplies no details on baselines, number of runs, statistical significance, error bars, or controls for LLM output stochasticity. These omissions prevent evaluation of the reliability of the reported accuracy gains and generalization claims.

Authors: Thank you for emphasizing the need for rigorous experimental details. The manuscript already compares against multiple baselines (SPOS, OFA, AutoFormer, and conventional algorithms) with results on ImageNet and NAS-Bench-201, and demonstrates generalization across search spaces. To improve reliability assessment, we will expand the Experimental Results section to report the number of independent runs, include mean performance with standard deviations (error bars), discuss controls for LLM stochasticity (e.g., fixed temperature and repeated sampling), and add statistical significance tests for key comparisons. These additions will be incorporated without altering the core claims.

Revision: yes
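
The reporting promised in this response (mean, standard deviation, and a significance test over independent runs) can be sketched with the standard library alone. This is an illustration under stated assumptions: Welch's unequal-variance t statistic is one reasonable choice and is not named in the rebuttal, the function names are hypothetical, and the accuracy lists in the usage lines are made-up placeholders, not results from the paper.

```python
import math
import statistics


def summarize(runs):
    """Mean and sample standard deviation over independent search runs."""
    return statistics.mean(runs), statistics.stdev(runs)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Placeholder accuracies for two methods over five seeds each.
    method = [73.4, 73.6, 73.5, 73.7, 73.3]
    baseline = [73.1, 73.2, 73.0, 73.4, 73.1]
    print(summarize(method))
    print(summarize(baseline))
    print(welch_t(method, baseline))
```

The t statistic and degrees of freedom would then be looked up against a t distribution (for example with a statistics package) to obtain a p-value; that step is omitted here to keep the sketch dependency-free.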

Circularity Check

0 steps flagged

Empirical NAS framework exhibits no circular derivation

The paper presents CoLLM-NAS as an empirical two-stage search procedure combining stateful Navigator LLM, stateless Generator LLM, and Coordinator orchestration. No equations, first-principles derivations, or predictions are claimed that reduce by construction to fitted parameters or self-citations. Performance and cost results are reported from external benchmarks (ImageNet, NAS-Bench-201) and generalization tests on spaces like MobileNet. The method is self-contained against these benchmarks with no load-bearing self-citation chains or ansatz smuggling. This is the expected non-finding for an applied search algorithm rather than a closed mathematical claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on the domain assumption that LLMs already encode useful knowledge about valid neural architectures and that iterative feedback can be effectively integrated without introducing new invalidity issues. The Coordinator is introduced as a new orchestration component without external validation.

axioms (1)

domain assumption: LLMs possess inherent knowledge of structured neural architectures that can be leveraged to guide search
Explicitly invoked in the abstract as the basis for knowledge-guided search.

invented entities (1)

Coordinator module (no independent evidence)
purpose: Orchestrate inter-LLM communication and manage evaluation processes
New component introduced to coordinate the Navigator and Generator.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "We propose a stateful Navigator LLM to guide search direction, a stateless Generator LLM to synthesize high-quality candidates, and a Coordinator module to orchestrate inter-LLM communication"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "combining LLMs' inherent knowledge of structured neural architectures with progressive knowledge from iterative feedback and historical trajectory"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Structuring Open-Ended NAS: Semi-Automated Design Knowledge Structuring with LLMs for Efficient Neural Architecture Search
cs.CV · 2026-05 · unverdicted · novelty 6.0

Authors structure architectural design knowledge with LLMs to create an open-ended NAS space and introduce FairNAD, which finds architectures improving 0.84, 2.17, and 2.35 points over SOTA on CIFAR-10, CIFAR-100, and...

LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search
cs.LG · 2026-04 · unverdicted · novelty 6.0

LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, ...

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · cited by 2 Pith papers · 2 internal anchors

[1] Anthropic. Claude Sonnet 4. https://www.anthropic.com/claude, 2025. Accessed: 2025-09-22.
[2] Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once-for-all: Train one network and specialize it for efficient deployment. In International Conference on Learning Representations, 2020.
[3] Han Cai, Ligeng Zhu, and Song Han. ProxylessNAS: Direct neural architecture search on target task and hardware. In International Conference on Learning Representations, 2019.
[4] Angelica Chen, David Dohan, and David So. Evoprompting: Language models for code-level neural architecture search. Advances in Neural Information Processing Systems, 36:7787–7817, 2023.
[5] Minghao Chen, Houwen Peng, Jianlong Fu, and Haibin Ling. Autoformer: Searching transformers for visual recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 12270–12280, October 2021.
[6] Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, and Junchi Yan. DARTS-: Robustly stepping out of performance collapse without indicators. In International Conference on Learning Representations, 2021.
[7] Xiangxiang Chu, Bo Zhang, and Ruijun Xu. Fairnas: Rethinking evaluation fairness of weight sharing neural architecture search. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12239–12248, 2021.
[8] DeepSeek-AI. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning, 2025.
[9] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
[10] Xuanyi Dong and Yi Yang. Nas-bench-201: Extending the scope of reproducible neural architecture search. In International Conference on Learning Representations, 2020.
[11] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025.
[12] Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, and Jian Sun. Single path one-shot neural architecture search with uniform sampling. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16, pages 544–560. Springer, 2020.
[13] Hyeonmin Ha, Ji-Hoon Kim, Semin Park, and Byung-Gon Chun. Sumnas: Supernet with unbiased meta-features for neural architecture search. In International Conference on Learning Representations, 2022.
[14] Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1314–1324, 2019.
[15] Tao Huang, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, and Chang Xu. Greedynasv2: Greedier search with a greedy path filter. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11902–11911, 2022.
[16] Jeimin Jeon, Youngmin Oh, Junghyup Lee, Donghyeon Baek, Dohyung Kim, Chanho Eom, and Bumsub Ham. Subnet-aware dynamic supernet training for neural architecture search. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 30137–30146, 2025.
[17] Zipeng Ji, Guanghui Zhu, Chunfeng Yuan, and Yihua Huang. Rz-nas: Enhancing llm-guided neural architecture search via reflective zero-cost strategy. In Forty-second International Conference on Machine Learning, 2025.
[18] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, and Ion Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023.
[19] Liam Li and Ameet Talwalkar. Random search and reproducibility for neural architecture search. In Uncertainty in Artificial Intelligence, pages 367–377. PMLR, 2020.
[20] Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. In International Conference on Learning Representations, 2019.
[21] Shun Lu, Yu Hu, Longxing Yang, Zihao Sun, Jilin Mei, Jianchao Tan, and Chengru Song. Pa&da: Jointly sampling path and data for consistent nas. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11940–11949, 2023.
[22] Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In Proceedings of the European Conference on Computer Vision (ECCV), pages 116–131, 2018.
[23] Muhammad Umair Nasir, Sam Earle, Julian Togelius, Steven James, and Christopher Cleghorn. Llmatic: neural architecture search via large language models and quality diversity optimization. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 1110–1118, 2024.
[24] OpenAI. Gpt-5. https://openai.com/gpt-5, 2025. Accessed: 2025-09-22.
[25] OpenAI. Introducing openai o3 and o4-mini. https://openai.com/index/o3-o4-mini-system-card/, April 2025. Accessed: 2025-09-22.
[26] Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019.
[27] Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V Le, and Alexey Kurakin. Large-scale evolution of image classifiers. In International Conference on Machine Learning, pages 2902–2911. PMLR, 2017.
[28] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2018.
[29] Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V Le. Mnasnet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2820–2828, 2019.
[30] Mingxing Tan and Quoc V. Le. Mixconv: Mixed depthwise convolutional kernels. In BMVC, page 74, 2019.
[31] Qwen Team. Qwen3 technical report, 2025.
[32] Xingyu Wu, Sheng-hao Wu, Jibin Wu, Liang Feng, and Kay Chen Tan. Evolutionary computation in the era of large language model: Survey and roadmap. IEEE Transactions on Evolutionary Computation, 2024.
[33] Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, and Hongkai Xiong. Pc-darts: Partial channel connections for memory-efficient architecture search. In International Conference on Learning Representations, 2020.
[34] Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V Le, Denny Zhou, and Xinyun Chen. Large language models as optimizers. In The Twelfth International Conference on Learning Representations, 2024.
[35] Shan You, Tao Huang, Mingmin Yang, Fei Wang, Chen Qian, and Changshui Zhang. Greedynas: Towards fast one-shot nas with greedy supernet. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1999–2008, 2020.
[36] Mingkai Zheng, Xiu Su, Shan You, Fei Wang, Chen Qian, Chang Xu, and Samuel Albanie. Can gpt-4 perform neural architecture search? arXiv preprint arXiv:2304.10970, 2023.
[37] Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016.
[38] Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8697–8710, 2018.

[1] [1]

Claude sonnet 4

Anthropic. Claude sonnet 4. https://www.anthropic.com/claude, 2025. Accessed: 2025-09-22

work page 2025

[2] [2]

Once-for-all: Train one network and specialize it for efficient deployment

Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, and Song Han. Once-for-all: Train one network and specialize it for efficient deployment. InInternational Conference on Learning Representations, 2020

work page 2020

[3] [3]

ProxylessNAS: Direct neural architecture search on target task and hardware

Han Cai, Ligeng Zhu, and Song Han. ProxylessNAS: Direct neural architecture search on target task and hardware. InInternational Conference on Learning Representations, 2019

work page 2019

[4] [4]

Evoprompting: Language models for code-level neural architecture search.Advances in neural information processing systems, 36:7787–7817, 2023

Angelica Chen, David Dohan, and David So. Evoprompting: Language models for code-level neural architecture search.Advances in neural information processing systems, 36:7787–7817, 2023

work page 2023

[5] [5]

Autoformer: Searching trans- formers for visual recognition

Minghao Chen, Houwen Peng, Jianlong Fu, and Haibin Ling. Autoformer: Searching trans- formers for visual recognition. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 12270–12280, October 2021

work page 2021

[6] [6]

{DARTS}- : Robustly stepping out of performance collapse without indicators

Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, and Junchi Yan. {DARTS}- : Robustly stepping out of performance collapse without indicators. InInternational Conference on Learning Representations, 2021

work page 2021

[7] [7]

Fairnas: Rethinking evaluation fairness of weight sharing neural architecture search

Xiangxiang Chu, Bo Zhang, and Ruijun Xu. Fairnas: Rethinking evaluation fairness of weight sharing neural architecture search. InProceedings of the IEEE/CVF International Conference on computer vision, pages 12239–12248, 2021

work page 2021

[8] [8]

Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning, 2025

DeepSeek-AI. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning, 2025

work page 2025

[9] [9]

Imagenet: A large- scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large- scale hierarchical image database. In2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009

work page 2009

[10] [10]

Nas-bench-201: Extending the scope of reproducible neural architecture search

Xuanyi Dong and Yi Yang. Nas-bench-201: Extending the scope of reproducible neural architecture search. InInternational Conference on Learning Representations, 2020

work page 2020

[11] [11]

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning.arXiv preprint arXiv:2501.12948, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[12] [12]

Single path one-shot neural architecture search with uniform sampling

Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, and Jian Sun. Single path one-shot neural architecture search with uniform sampling. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XVI 16, pages 544–560. Springer, 2020

work page 2020

[13] [13]

Sumnas: Supernet with unbiased meta-features for neural architecture search

Hyeonmin Ha, Ji-Hoon Kim, Semin Park, and Byung-Gon Chun. Sumnas: Supernet with unbiased meta-features for neural architecture search. In International Conference on Learning Representations, 2022

work page 2022

[14] [14]

Searching for mobilenetv3

Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1314–1324, 2019

work page 2019

[15] [15]

Greedynasv2: Greedier search with a greedy path filter

Tao Huang, Shan You, Fei Wang, Chen Qian, Changshui Zhang, Xiaogang Wang, and Chang Xu. Greedynasv2: Greedier search with a greedy path filter. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11902–11911, 2022

work page 2022

[16] [16]

Subnet-aware dynamic supernet training for neural architecture search

Jeimin Jeon, Youngmin Oh, Junghyup Lee, Donghyeon Baek, Dohyung Kim, Chanho Eom, and Bumsub Ham. Subnet-aware dynamic supernet training for neural architecture search. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 30137–30146, 2025

work page 2025

[17] [17]

Rz-nas: Enhancing llm-guided neural architecture search via reflective zero-cost strategy

Zipeng Ji, Guanghui Zhu, Chunfeng Yuan, and Yihua Huang. Rz-nas: Enhancing llm-guided neural architecture search via reflective zero-cost strategy. In Forty-second International Conference on Machine Learning, 2025

work page 2025

[18] [18]

Efficient memory management for large language model serving with pagedattention

Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, and Ion Stoica. Efficient memory management for large language model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023

work page 2023

[19] [19]

Random search and reproducibility for neural architecture search

Liam Li and Ameet Talwalkar. Random search and reproducibility for neural architecture search. In Uncertainty in Artificial Intelligence, pages 367–377. PMLR, 2020

work page 2020

[20] [20]

DARTS: Differentiable architecture search

Hanxiao Liu, Karen Simonyan, and Yiming Yang. DARTS: Differentiable architecture search. In International Conference on Learning Representations, 2019

work page 2019

[21] [21]

Pa&da: Jointly sampling path and data for consistent nas

Shun Lu, Yu Hu, Longxing Yang, Zihao Sun, Jilin Mei, Jianchao Tan, and Chengru Song. Pa&da: Jointly sampling path and data for consistent nas. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11940–11949, 2023

work page 2023

[22] [22]

Shufflenet v2: Practical guidelines for efficient cnn architecture design

Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In Proceedings of the European Conference on Computer Vision (ECCV), pages 116–131, 2018

work page 2018

[23] [23]

Llmatic: neural architecture search via large language models and quality diversity optimization

Muhammad Umair Nasir, Sam Earle, Julian Togelius, Steven James, and Christopher Cleghorn. Llmatic: neural architecture search via large language models and quality diversity optimization. In Proceedings of the Genetic and Evolutionary Computation Conference, pages 1110–1118, 2024

work page 2024

[24] [24]

GPT-5

OpenAI. GPT-5. https://openai.com/gpt-5, 2025. Accessed: 2025-09-22

work page 2025

[25] [25]

Introducing OpenAI o3 and o4-mini

OpenAI. Introducing OpenAI o3 and o4-mini. https://openai.com/index/o3-o4-mini-system-card/, April 2025. Accessed: 2025-09-22

work page 2025

[26] [26]

Regularized evolution for image classifier architecture search

Esteban Real, Alok Aggarwal, Yanping Huang, and Quoc V Le. Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789, 2019

work page 2019

[27] [27]

Large-scale evolution of image classifiers

Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V Le, and Alexey Kurakin. Large-scale evolution of image classifiers. In International Conference on Machine Learning, pages 2902–2911. PMLR, 2017

work page 2017

[28] [28]

MobileNetV2: Inverted residuals and linear bottlenecks

Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR, 2018

work page 2018

[29] [29]

Mnasnet: Platform-aware neural architecture search for mobile

Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, and Quoc V Le. Mnasnet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2820–2828, 2019

work page 2019

[30] [30]

Mingxing Tan and Quoc V. Le. Mixconv: Mixed depthwise convolutional kernels. In BMVC, page 74, 2019

work page 2019

[31] [31]

Qwen3 technical report, 2025

Qwen Team. Qwen3 technical report, 2025

work page 2025

[32] [32]

Evolutionary computation in the era of large language model: Survey and roadmap

Xingyu Wu, Sheng-hao Wu, Jibin Wu, Liang Feng, and Kay Chen Tan. Evolutionary computation in the era of large language model: Survey and roadmap. IEEE Transactions on Evolutionary Computation, 2024

work page 2024

[33] [33]

Pc-darts: Partial channel connections for memory-efficient architecture search

Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, and Hongkai Xiong. Pc-darts: Partial channel connections for memory-efficient architecture search. In International Conference on Learning Representations, 2020

work page 2020

[34] [34]

Large language models as optimizers

Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V Le, Denny Zhou, and Xinyun Chen. Large language models as optimizers. In The Twelfth International Conference on Learning Representations, 2024

work page 2024

[35] [35]

Greedynas: Towards fast one-shot nas with greedy supernet

Shan You, Tao Huang, Mingmin Yang, Fei Wang, Chen Qian, and Changshui Zhang. Greedynas: Towards fast one-shot nas with greedy supernet. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1999–2008, 2020

work page 2020

[36] [36]

Can GPT-4 Perform Neural Architecture Search?

Mingkai Zheng, Xiu Su, Shan You, Fei Wang, Chen Qian, Chang Xu, and Samuel Albanie. Can gpt-4 perform neural architecture search? arXiv preprint arXiv:2304.10970, 2023

work page arXiv 2023

[37] [37]

Neural Architecture Search with Reinforcement Learning

Barret Zoph and Quoc V Le. Neural architecture search with reinforcement learning. arXiv preprint arXiv:1611.01578, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[38] [38]

Learning transferable architectures for scalable image recognition

Barret Zoph, Vijay Vasudevan, Jonathon Shlens, and Quoc V Le. Learning transferable architectures for scalable image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8697–8710, 2018

A Key Experimental Settings

Table 6: Key experimental settings. "same" indicates identical settings to the corresponding...

work page 2018