The Quest for the Golden Activation Function
Deep neural networks have proven beneficial for a variety of tasks, in particular by allowing end-to-end learning and reducing the need for manual design decisions. However, many parameters still have to be chosen in advance, which raises the need to optimize them. One important but often ignored system parameter is the choice of activation function. In this paper we therefore aim to demonstrate the importance of activation functions in general and to show that different activation functions may be appropriate for different tasks. To avoid manually designing or selecting activation functions, we build on the idea of genetic algorithms to learn the best activation function for a given task. In addition, we introduce two new activation functions, ELiSH and HardELiSH, which can easily be incorporated into our framework. In this way, we demonstrate on three different image classification benchmarks that different activation functions are learned, which also yield improved results compared to typically used baselines.
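The abstract names ELiSH and HardELiSH without defining them. As a rough sketch, the code below uses the piecewise forms usually attributed to the paper: for non-negative inputs, x times a sigmoid (or hard sigmoid); for negative inputs, (e^x - 1) times the same gate. These formulas, and the helper names `sigmoid`, `hard_sigmoid`, `elish`, and `hard_elish`, are assumptions that do not appear in the abstract and should be checked against the paper itself.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear gate, assumed here to be clip((x + 1) / 2, 0, 1)."""
    return max(0.0, min(1.0, (x + 1.0) / 2.0))


def elish(x: float) -> float:
    """Assumed ELiSH: x * sigmoid(x) for x >= 0, (e^x - 1) * sigmoid(x) otherwise."""
    if x >= 0:
        return x * sigmoid(x)
    return (math.exp(x) - 1.0) * sigmoid(x)


def hard_elish(x: float) -> float:
    """Assumed HardELiSH: same structure as ELiSH with the hard sigmoid as gate."""
    if x >= 0:
        return x * hard_sigmoid(x)
    return (math.exp(x) - 1.0) * hard_sigmoid(x)


if __name__ == "__main__":
    # Print a few sample values to compare the two functions.
    for v in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={v:+.1f}  elish={elish(v):+.4f}  hard_elish={hard_elish(v):+.4f}")
```

Under these assumed definitions, both functions pass through zero at the origin and approach the identity for large positive inputs, while the negative branch stays bounded because both factors are bounded there.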
Forward citations
Cited by 1 Pith paper
- Neural Network Architecture Search with Differentiable Cartesian Genetic Programming for Regression
dCGPANN encodes neural nets so evolutionary operators can rewire, prune, adapt activations and add skips while gradient descent tunes parameters, yielding smaller networks with lower regression error in fixed time.