DARTS: Differentiable Architecture Search · 2018 · cs.LG · arXiv:1806.09055

31 Pith papers cite this work. Polarity classification is still indexing.

abstract

This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
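The abstract does not spell out the continuous relaxation. One common reading, assumed here, is that each discrete choice among candidate operations is replaced by a softmax-weighted mixture of all of them, so the mixing logits can be tuned by gradient descent and later collapsed back to a single operation. The sketch below illustrates that idea on scalar inputs; the operation set, the names `CANDIDATE_OPS`, `mixed_op`, and `discretize`, and the scalar setting are all hypothetical and are not taken from the paper's released implementation.

```python
import math

def softmax(alphas):
    """Numerically stable softmax over a list of architecture logits."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations acting on a scalar input.
# A real search space would hold convolutions, pooling, skip connections, etc.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate op's output."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))

def discretize(alphas):
    """After search, keep only the operation with the largest logit."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]

if __name__ == "__main__":
    # With equal logits the output is the plain average of the three ops.
    print(mixed_op(1.0, [0.0, 0.0, 0.0]))
    print(discretize([0.1, 2.0, -1.0]))
```

Because `mixed_op` is a smooth function of `alphas`, an automatic-differentiation framework could compute gradients of a validation loss with respect to those logits, which is what makes gradient-based search possible under this assumption.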

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

AGAN: Towards Automated Design of Generative Adversarial Networks

cs.LG · 2019-06-25 · unverdicted · novelty 8.0

AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.

1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job?

cs.LG · 2026-05-16 · unverdicted · novelty 7.0

Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.

Lattice fermion formulation via Physics-Informed Neural Networks: Ginsparg-Wilson relation and Overlap fermions

hep-lat · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

Physics-Informed Neural Networks construct lattice Dirac operators satisfying the Ginsparg-Wilson relation, reproducing overlap fermions to high accuracy and discovering a Fujikawa-type generalized relation via algebraic search.

Soft Head Selection for Injecting ICL-Derived Task Embeddings

cs.CL · 2025-07-28 · conditional · novelty 7.0

SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B parameters.

Switchable Normalization for Learning-to-Normalize Deep Representation

cs.CV · 2019-07-22 · unverdicted · novelty 7.0

Switchable Normalization learns per-layer weights to combine channel, layer, and minibatch normalizers, claiming robustness to batch size and better results than fixed normalizers on ImageNet, COCO, CityScapes, ADE20K, MegaFace, and Kinetics.

NetTailor: Tuning the Architecture, Not Just the Weights

cs.CV · 2019-06-29 · unverdicted · novelty 7.0

NetTailor adapts CNN architecture for new tasks by assembling pre-trained universal blocks with task-specific layers, trained via activation mimicry and complexity penalties to match accuracy while reducing size for simpler tasks.

PACE: Two-Timescale Self-Evolution for Small Language Model Agents

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

PACE coordinates low-risk prompt evolution with validated higher-risk control-logic updates to improve frozen SLM agents on benchmarks without model retraining.

AutoMCU: Feasibility-First MCU Neural Network Customization via LLM-based Multi-Agent Systems

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

AutoMCU uses feasibility-first LLM multi-agent coordination to automate MCU-constrained neural network design, delivering competitive accuracy on CIFAR-10/100 in 1-2 hours versus hundreds of GPU hours for prior HW-NAS methods.

PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.

CHAL: Council of Hierarchical Agentic Language

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.

Hybrid-LoRA: Bridging Full Fine-Tuning and Low-Rank Adaptation for Post-Training

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Hybrid-LoRA selectively full fine-tunes modules with high sensitivity to low-rank adaptation using a novel score and applies LoRA elsewhere, matching full fine-tuning at 10% budget and outperforming PEFT baselines by up to 5.65%.

Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

cs.LG · 2025-11-03 · unverdicted · novelty 6.0

Introduces a novel search direction enabling sublinear stochastic bilevel regret guarantees for first- and zeroth-order online bilevel optimization algorithms without relying on window smoothing.

Quantum Circuit Design using a Progressive Widening Enhanced Monte Carlo Tree Search

quant-ph · 2025-02-06 · unverdicted · novelty 6.0

Progressive widening MCTS with sampling action space automates quantum circuit design, cutting evaluations 10-100x and CNOT gates up to 3x versus prior MCTS on chemistry and linear-equation tasks.

Learnable Parameter Similarity

cs.LG · 2019-07-27 · unverdicted · novelty 6.0

LPS uses a second-order neural network to learn an end-to-end metric for second-order parameter similarity and introduces the ModelSet500 benchmark with 500 trained models.

Video Action Recognition Via Neural Architecture Searching

cs.CV · 2019-07-10 · unverdicted · novelty 6.0

Uses differentiable NAS with temporal segments and pseudo-3D operators to discover a video action recognition network that outperforms hand-designed models on UCF101 with ~1% of the parameters when trained from scratch.

Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge

cs.AR · 2026-04-11 · unverdicted · novelty 6.0

FLAME models layer-wise overlapping parallelism and asynchronous CPU-GPU pipeline bubbles to estimate inference latency across frequencies with sparse profiling and low error for DNNs and SLMs.

LLaVA-Video: Video Instruction Tuning With Synthetic Data

cs.CV · 2024-10-03 · unverdicted · novelty 6.0

LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.

TusoAI: Agentic Optimization for Scientific Methods

cs.AI · 2025-09-28 · unverdicted · novelty 5.0

TusoAI is an LLM-based agent that builds and iteratively optimizes domain-specific computational methods for scientific data analysis, outperforming expert baselines on RNA-seq denoising and earth monitoring while reporting new genetic associations.

On Constraint Qualifications for MPECs with Applications to Bilevel Hyperparameter Optimization for Machine Learning

math.OC · 2025-08-18 · unverdicted · novelty 5.0

Clarifies relationships among MPEC constraint qualifications and fully characterizes MPEC-LICQ for the MPEC from bilevel hyperparameter optimization in L1-loss SVM classification.

ORFS-agent: Tool-Using Agents for Chip Design Optimization

cs.AI · 2025-06-10 · unverdicted · novelty 5.0

ORFS-agent uses LLM agents to tune parameters in chip design flows, improving geometric-mean wirelength, clock period, and co-optimization objectives by up to 2.7% over OR-AutoTuner with 40% fewer iterations on ASAP7 and SKY130HD benchmarks.

EPNAS: Efficient Progressive Neural Architecture Search

cs.LG · 2019-07-07 · unverdicted · novelty 5.0

EPNAS uses a progressive search policy with REINFORCE performance prediction to search neural architectures in parallel, supporting multiple resource constraints and outperforming ENAS and PNAS on CIFAR-10 and ImageNet in speed and accuracy.

End-to-end Automated Deep Neural Network Optimization for PPG-based Blood Pressure Estimation on Wearables

cs.LG · 2026-04-11 · unverdicted · novelty 5.0

An end-to-end hardware-aware optimization pipeline produces DNNs for PPG-based blood pressure estimation with up to 7.99% lower error and 83x fewer parameters that fit on ultra-low-power SoCs like GAP8.

Exploring Vision Neural Network Pruning via Screening Methodology

cs.LG · 2025-02-11 · unverdicted · novelty 4.0

A unified F-statistic screening and weighted evaluation method prunes both unstructured and structured parameters in FNNs and CNNs, claiming order-of-magnitude size reduction with competitive accuracy on vision datasets.

Adaptive Reorganization of Neural Pathways for Continual Learning with Spiking Neural Networks

cs.NE · 2023-09-18 · unverdicted · novelty 4.0

SOR-SNN employs Self-Organizing Regulation networks to reorganize a single SNN into sparse pathways, achieving better performance, energy efficiency, memory use, backward transfer, and self-repair on continual learning tasks including CIFAR100 and ImageNet.

citing papers explorer

Showing 31 of 31 citing papers.

AGAN: Towards Automated Design of Generative Adversarial Networks cs.LG · 2019-06-25 · unverdicted · none · ref 30 · internal anchor
AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.
1GC-7RC: One Graphic Card -- Seven Research Challenges! How Good Are AI Agents at Doing Your Job? cs.LG · 2026-05-16 · unverdicted · none · ref 28 · internal anchor
Introduces the 1GC-7RC benchmark to evaluate AI coding agents on seven diverse ML tasks under single-GPU time and access constraints.
Lattice fermion formulation via Physics-Informed Neural Networks: Ginsparg-Wilson relation and Overlap fermions hep-lat · 2026-05-07 · unverdicted · none · ref 34 · 2 links · internal anchor
Physics-Informed Neural Networks construct lattice Dirac operators satisfying the Ginsparg-Wilson relation, reproducing overlap fermions to high accuracy and discovering a Fujikawa-type generalized relation via algebraic search.
Soft Head Selection for Injecting ICL-Derived Task Embeddings cs.CL · 2025-07-28 · conditional · none · ref 11 · internal anchor
SITE applies soft gradient-based head selection to inject ICL-derived task embeddings, outperforming prior embedding adaptation and few-shot ICL across generation, reasoning, and NLU tasks on 12 LLMs from 4B to 70B parameters.
Switchable Normalization for Learning-to-Normalize Deep Representation cs.CV · 2019-07-22 · unverdicted · none · ref 41 · internal anchor
Switchable Normalization learns per-layer weights to combine channel, layer, and minibatch normalizers, claiming robustness to batch size and better results than fixed normalizers on ImageNet, COCO, CityScapes, ADE20K, MegaFace, and Kinetics.
NetTailor: Tuning the Architecture, Not Just the Weights cs.CV · 2019-06-29 · unverdicted · none · ref 35 · internal anchor
NetTailor adapts CNN architecture for new tasks by assembling pre-trained universal blocks with task-specific layers, trained via activation mimicry and complexity penalties to match accuracy while reducing size for simpler tasks.
PACE: Two-Timescale Self-Evolution for Small Language Model Agents cs.LG · 2026-05-21 · unverdicted · none · ref 14 · internal anchor
PACE coordinates low-risk prompt evolution with validated higher-risk control-logic updates to improve frozen SLM agents on benchmarks without model retraining.
AutoMCU: Feasibility-First MCU Neural Network Customization via LLM-based Multi-Agent Systems cs.LG · 2026-05-20 · unverdicted · none · ref 36 · internal anchor
AutoMCU uses feasibility-first LLM multi-agent coordination to automate MCU-constrained neural network design, delivering competitive accuracy on CIFAR-10/100 in 1-2 hours versus hundreds of GPU hours for prior HW-NAS methods.
PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts cs.CL · 2026-05-13 · unverdicted · none · ref 38 · internal anchor
PEML co-optimizes continuous prompts and low-rank adaptations to deliver up to 6.67% average accuracy gains over existing multi-task PEFT methods on GLUE, SuperGLUE, and other benchmarks.
CHAL: Council of Hierarchical Agentic Language cs.AI · 2026-05-12 · unverdicted · none · ref 103 · internal anchor
CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.
Hybrid-LoRA: Bridging Full Fine-Tuning and Low-Rank Adaptation for Post-Training cs.LG · 2026-05-12 · unverdicted · none · ref 3 · internal anchor
Hybrid-LoRA selectively full fine-tunes modules with high sensitivity to low-rank adaptation using a novel score and applies LoRA elsewhere, matching full fine-tuning at 10% budget and outperforming PEFT baselines by up to 5.65%.
Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization cs.LG · 2025-11-03 · unverdicted · none · ref 52 · internal anchor
Introduces a novel search direction enabling sublinear stochastic bilevel regret guarantees for first- and zeroth-order online bilevel optimization algorithms without relying on window smoothing.
Quantum Circuit Design using a Progressive Widening Enhanced Monte Carlo Tree Search quant-ph · 2025-02-06 · unverdicted · none · ref 24 · internal anchor
Progressive widening MCTS with sampling action space automates quantum circuit design, cutting evaluations 10-100x and CNOT gates up to 3x versus prior MCTS on chemistry and linear-equation tasks.
Learnable Parameter Similarity cs.LG · 2019-07-27 · unverdicted · none · ref 11 · internal anchor
LPS uses a second-order neural network to learn an end-to-end metric for second-order parameter similarity and introduces the ModelSet500 benchmark with 500 trained models.
Video Action Recognition Via Neural Architecture Searching cs.CV · 2019-07-10 · unverdicted · none · ref 19 · internal anchor
Uses differentiable NAS with temporal segments and pseudo-3D operators to discover a video action recognition network that outperforms hand-designed models on UCF101 with ~1% of the parameters when trained from scratch.
Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge cs.AR · 2026-04-11 · unverdicted · none · ref 44
FLAME models layer-wise overlapping parallelism and asynchronous CPU-GPU pipeline bubbles to estimate inference latency across frequencies with sparse profiling and low error for DNNs and SLMs.
LLaVA-Video: Video Instruction Tuning With Synthetic Data cs.CV · 2024-10-03 · unverdicted · none · ref 95
LLaVA-Video-178K is a new synthetic video instruction dataset that, when combined with existing data to train LLaVA-Video, produces strong results on video understanding benchmarks.
TusoAI: Agentic Optimization for Scientific Methods cs.AI · 2025-09-28 · unverdicted · none · ref 23 · internal anchor
TusoAI is an LLM-based agent that builds and iteratively optimizes domain-specific computational methods for scientific data analysis, outperforming expert baselines on RNA-seq denoising and earth monitoring while reporting new genetic associations.
On Constraint Qualifications for MPECs with Applications to Bilevel Hyperparameter Optimization for Machine Learning math.OC · 2025-08-18 · unverdicted · none · ref 9 · internal anchor
Clarifies relationships among MPEC constraint qualifications and fully characterizes MPEC-LICQ for the MPEC from bilevel hyperparameter optimization in L1-loss SVM classification.
ORFS-agent: Tool-Using Agents for Chip Design Optimization cs.AI · 2025-06-10 · unverdicted · none · ref 33 · internal anchor
ORFS-agent uses LLM agents to tune parameters in chip design flows, improving geometric-mean wirelength, clock period, and co-optimization objectives by up to 2.7% over OR-AutoTuner with 40% fewer iterations on ASAP7 and SKY130HD benchmarks.
EPNAS: Efficient Progressive Neural Architecture Search cs.LG · 2019-07-07 · unverdicted · none · ref 28 · internal anchor
EPNAS uses a progressive search policy with REINFORCE performance prediction to search neural architectures in parallel, supporting multiple resource constraints and outperforming ENAS and PNAS on CIFAR-10 and ImageNet in speed and accuracy.
End-to-end Automated Deep Neural Network Optimization for PPG-based Blood Pressure Estimation on Wearables cs.LG · 2026-04-11 · unverdicted · none · ref 48
An end-to-end hardware-aware optimization pipeline produces DNNs for PPG-based blood pressure estimation with up to 7.99% lower error and 83x fewer parameters that fit on ultra-low-power SoCs like GAP8.
Exploring Vision Neural Network Pruning via Screening Methodology cs.LG · 2025-02-11 · unverdicted · none · ref 33 · internal anchor
A unified F-statistic screening and weighted evaluation method prunes both unstructured and structured parameters in FNNs and CNNs, claiming order-of-magnitude size reduction with competitive accuracy on vision datasets.
Adaptive Reorganization of Neural Pathways for Continual Learning with Spiking Neural Networks cs.NE · 2023-09-18 · unverdicted · none · ref 65 · internal anchor
SOR-SNN employs Self-Organizing Regulation networks to reorganize a single SNN into sparse pathways, achieving better performance, energy efficiency, memory use, backward transfer, and self-repair on continual learning tasks including CIFAR100 and ImageNet.
Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation eess.IV · 2019-07-26 · unverdicted · none · ref 18 · internal anchor
Self-adaptive 2D-3D FCN ensemble optimized by multiobjective evolution for prostate segmentation on PROMISE12 achieves top-10 ranking with smaller size than prior auto-designed models.
Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI cs.CV · 2026-04-27 · unverdicted · none · ref 27
Deployment-aligned low-precision NAS recovers about two-thirds of the accuracy drop from post-training quantization, achieving 0.826 mIoU on-device for a 95k-parameter model on Intel Movidius Myriad X without added complexity.
Genetic Deep Learning for Lung Cancer Screening cs.CV · 2019-07-27 · unverdicted · none · ref 17 · internal anchor
Genetic algorithm designs a CNN for lung cancer detection in CXRs achieving 97.15% accuracy, outperforming Inception-V3 and ResNet-152 with 4x and 14x fewer parameters.
Genetic Network Architecture Search cs.NE · 2019-07-05 · unverdicted · none · ref 5 · internal anchor
Genetic algorithm searches convolution cell architectures with weight sharing via SGD, reporting 96% accuracy on CIFAR10 and 80.1% on CIFAR100.
Spiking Neural Network Architecture Search: A Survey cs.NE · 2025-10-16 · unverdicted · none · ref 144 · internal anchor
A survey of Spiking Neural Network architecture search techniques viewed through a hardware/software co-design lens.
SURGE: Surrogate Gradient Adaptation in Binary Neural Networks cs.LG · 2026-05-09 · unreviewed · ref 119 · 2 links · internal anchor
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery cs.CL · 2026-04-07 · unreviewed · ref 14
