Recognition: unknown
Neural Architecture Search with Reinforcement Learning
read the original abstract
Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.
This paper has not been read by Pith yet.
Forward citations
Cited by 12 Pith papers
-
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
AutoTTS discovers width-depth test-time scaling controllers through agentic search in a pre-collected trajectory environment, yielding better accuracy-cost tradeoffs than hand-designed baselines on math reasoning task...
-
LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling
AutoTTS discovers superior test-time scaling strategies for LLMs via cheap controller synthesis in a pre-collected trajectory environment, outperforming manual baselines on math benchmarks with low discovery cost.
-
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
AutoSOTA uses eight specialized agents to replicate and optimize models from recent AI papers, producing 105 new SOTA results in about five hours per paper on average.
-
RELO: Reinforcement Learning to Localize for Visual Object Tracking
RELO replaces handcrafted spatial priors with a reinforcement learning policy for target localization in visual tracking and reports 57.5% AUC on LaSOText without template updates.
-
OMEGA: Optimizing Machine Learning by Evaluating Generated Algorithms
OMEGA framework generates novel ML classifiers via meta-prompts and executable code that outperform scikit-learn baselines on 20 benchmark datasets.
-
TRON: Trainable, architecture-reconfigurable random optical neural networks
TRON demonstrates a trainable and reconfigurable optical neural network that combines multi-scattering media with DMD-based matrix multiplication and performs in-situ optimization plus neural architecture search on th...
-
LLaVA-Video: Video Instruction Tuning With Synthetic Data
LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.
-
Heterogeneous Connectivity in Sparse Networks: Fan-in Profiles, Gradient Hierarchy, and Topological Equilibria
Arbitrary heterogeneous fan-in profiles in sparse networks match uniform random accuracy at high sparsity, but initializing RigL dynamic sparse training with equilibrium-matched lognormal profiles improves performance...
-
From LLM to Silicon: RL-Driven ASIC Architecture Exploration for On-Device AI Inference
An RL agent using Soft Actor-Critic with Mixture-of-Experts jointly optimizes ASIC architecture, memory hierarchy, and partitioning for AI inference, achieving 29809 tokens/s for Llama 3.1 at 3nm and under 13mW for Sm...
-
Efficient Accelerated Graph Edit Distance Computation on GPU
FAST-GED delivers orders-of-magnitude speedups over NetworkX for graph edit distance on GPUs while often reaching optimal solutions and outperforming approximate methods.
-
BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH
BayMOTH unifies meta-Bayesian optimization with a usefulness-based fallback to lookahead, demonstrating competitive results on function optimization tasks even under low task relatedness.
-
Split and Aggregation Learning for Foundation Models Over Mobile Embodied AI Network (MEAN): A Comprehensive Survey
The paper surveys split and aggregation learning for foundation models in 6G networks to improve efficiency, resource use, and data privacy in distributed AI.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.