DARTS+: Improved Differentiable Architecture Search with Early Stopping
Recently, there has been a growing interest in automating the process of neural architecture design, and the Differentiable Architecture Search (DARTS) method makes the process available within a few GPU days. However, the performance of DARTS is often observed to collapse when the number of search epochs becomes large. Meanwhile, lots of "skip-connects" are found in the selected architectures. In this paper, we claim that the cause of the collapse is that there exists overfitting in the optimization of DARTS. Therefore, we propose a simple and effective algorithm, named "DARTS+", to avoid the collapse and improve the original DARTS, by "early stopping" the search procedure when meeting a certain criterion. We also conduct comprehensive experiments on benchmark datasets and different search spaces and show the effectiveness of our DARTS+ algorithm, and DARTS+ achieves 2.32% test error on CIFAR10, 14.87% on CIFAR100, and 23.7% on ImageNet. We further remark that the idea of "early stopping" is implicitly included in some existing DARTS variants by manually setting a small number of search epochs, while we give an explicit criterion for "early stopping".
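The abstract does not spell out the stopping criterion, so the following is only a minimal sketch of what "early stopping the search procedure when meeting a certain criterion" could look like. It assumes a hypothetical criterion (stop once the derived cell contains a threshold number of skip-connects, motivated by the abstract's observation about skip-connects) and a hypothetical `run_epoch` callback that runs one search epoch and returns the operations of the currently derived cell. The names `search_with_early_stopping`, `max_skip_connects`, and `toy_epoch`, as well as the threshold value, are illustrative assumptions and not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

SKIP = "skip_connect"  # assumed label for the skip-connect operation


def count_skip_connects(cell_ops: Sequence[str]) -> int:
    """Count how many edges of the derived cell selected a skip-connect."""
    return sum(1 for op in cell_ops if op == SKIP)


def search_with_early_stopping(
    run_epoch: Callable[[int], List[str]],
    max_epochs: int,
    max_skip_connects: int = 2,  # assumed threshold, for illustration only
) -> Tuple[int, List[str]]:
    """Run search epochs until the (hypothetical) criterion fires.

    Returns the index of the last epoch that was run and the cell
    derived at that epoch.
    """
    last_epoch, cell_ops = -1, []
    for epoch in range(max_epochs):
        cell_ops = run_epoch(epoch)  # one search epoch, then derive the cell
        last_epoch = epoch
        if count_skip_connects(cell_ops) >= max_skip_connects:
            break  # explicit criterion met: stop searching
    return last_epoch, cell_ops


def toy_epoch(epoch: int) -> List[str]:
    """Simulated search: one more edge drifts to skip-connect every 10 epochs."""
    n_skip = min(epoch // 10, 8)
    return [SKIP] * n_skip + ["sep_conv_3x3"] * (8 - n_skip)


stopped_at, final_cell = search_with_early_stopping(toy_epoch, max_epochs=50)
```

In this toy run the loop halts well before the 50-epoch budget, which mirrors the contrast the abstract draws between an explicit criterion and manually fixing a small number of search epochs.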
Forward citations
Cited by 3 Pith papers
- Harvesting AI Computation at the Edge via Generic Approximation
  A framework converts traditional edge tasks to NN models via NAS and schedules them on idle AI chips to improve performance without affecting primary workloads.
- Implantable Adaptive Cells: A Novel Enhancement for Pre-Trained U-Nets in Medical Image Segmentation
  Introduces Implantable Adaptive Cells inserted into pre-trained U-Nets via Partially-Connected DARTS to achieve approximately 5 percentage point gains in segmentation accuracy on four medical MRI/CT datasets.
- Bilevel Optimization for Neural Architecture Search
  Reviews NAS methods through a bilevel optimization lens, categorizing them into sampling-based and theory-based, and proposes an auxiliary math programming framework for more principled architecture and weight updates.