Activation Functions: Comparison of trends in Practice and Research for Deep Learning

· 2018 · cs.LG · arXiv 1811.03378

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Deep neural networks have been successfully used in diverse emerging domains to solve complex real-world problems, with many more deep learning (DL) architectures being developed to date. To achieve these state-of-the-art performances, the DL architectures use activation functions (AFs) to perform diverse computations between the hidden layers and the output layers of any given DL architecture. This paper presents a survey of the existing AFs used in deep learning applications and highlights the recent trends in the use of activation functions for deep learning applications. The novelty of this paper is that it compiles the majority of the AFs used in DL and outlines the current trends in the application and usage of these functions in practical deep learning deployments against the state-of-the-art research results. This compilation will aid in making effective decisions in the choice of the most suitable and appropriate activation function for any given application, ready for deployment. This paper is timely because most research papers on AFs highlight similar works and results, while this paper will be the first to compile the trends in AF applications in practice against the research results from the literature found in deep learning research to date.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Probing Memorization of Tabular In-Context Learning

cs.LG · 2026-06-30 · unverdicted · novelty 7.0

A new probing framework detects moderate parametric memorization signals in tabular in-context learning models under single-task fine-tuning, strongest on low-cardinality tasks, but signals largely disappear under realistic training.

SurfDesign: Effective Protein Design on Molecular Surfaces

q-bio.BM · 2026-05-25 · unverdicted · novelty 6.0

SurfDesign introduces surface-conditioned protein design via manifold modeling and equivariant message passing on surfaces integrated with pretrained language models, outperforming prior methods on binder and enzyme design benchmarks.

Physics-Informed Neural Network Modeling of Biodegradable Contaminant Transport through GCL/SL Composite Liners

cs.LG · 2026-06-03 · unverdicted · novelty 5.0

A hard-constrained PINN (H-PINN) achieves lower errors (MAE 0.011-0.023, MRE 2.08-3.14%) than standard PINN when modeling two-domain contaminant transport through GCL/SL liners and supports inverse estimation of degradation half-life.

Expressivity of congruence-based architectures for DNNs on positive-definite matrices

cs.LG · 2026-06-01 · unverdicted · novelty 5.0

Semi-orthogonality in congruence layers for SPD matrix DNNs collapses expressivity to one-hidden-layer equivalents for certain activations due to Poincaré's separation theorem.

Principles and Practice of Deep Representation Learning: or a Mathematical Theory of Memory

cs.LG · 2026-06-04 · unverdicted · novelty 3.0

The book presents principles from optimization and information theory to explain deep network architectures and enable new interpretable models.

Survey on Disaster Management Datasets for Remote Sensing Based Emergency Applications

cs.CV · 2026-05-05 · unverdicted · novelty 3.0

A survey providing an overview of publicly available image-based datasets for ML/DL-based disaster management pipelines covering pre-disaster, during, and post-disaster phases.

citing papers explorer

Showing 4 of 4 citing papers after filters.

Probing Memorization of Tabular In-Context Learning

cs.LG · 2026-06-30 · unverdicted · none · ref 112 · internal anchor

A new probing framework detects moderate parametric memorization signals in tabular in-context learning models under single-task fine-tuning, strongest on low-cardinality tasks, but signals largely disappear under realistic training.

Physics-Informed Neural Network Modeling of Biodegradable Contaminant Transport through GCL/SL Composite Liners

cs.LG · 2026-06-03 · unverdicted · none · ref 2 · internal anchor

A hard-constrained PINN (H-PINN) achieves lower errors (MAE 0.011-0.023, MRE 2.08-3.14%) than standard PINN when modeling two-domain contaminant transport through GCL/SL liners and supports inverse estimation of degradation half-life.

Expressivity of congruence-based architectures for DNNs on positive-definite matrices

cs.LG · 2026-06-01 · unverdicted · none · ref 25 · internal anchor

Semi-orthogonality in congruence layers for SPD matrix DNNs collapses expressivity to one-hidden-layer equivalents for certain activations due to Poincaré's separation theorem.

Principles and Practice of Deep Representation Learning: or a Mathematical Theory of Memory

cs.LG · 2026-06-04 · unverdicted · none · ref 70 · internal anchor

The book presents principles from optimization and information theory to explain deep network architectures and enable new interpretable models.
