Understanding the difficulty of training deep feedforward neural networks

Xavier Glorot, Yoshua Bengio · 2010

5 Pith papers cite this work. Polarity classification is still in progress.

representative citing papers

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies

cs.NE · 2026-02-26 · unverdicted · novelty 7.0

Isotropic activation functions derived from reparameterisation symmetries and SVD diagonalisation enable function-preserving neuron removal and addition in dense networks, supporting up to 50% sparsification and real-time topology adaptation.

Invariant Manifolds of Discrete-time Dynamical Systems with Nonlinear Exosystems via Hybrid Physics-Informed Neural Networks

math.NA · 2025-06-16 · unverdicted · novelty 7.0

A hybrid physics-informed framework using polynomials and neural networks approximates invariant manifolds of discrete-time dynamical systems with nonlinear exosystems and shows higher accuracy than pure polynomial or neural approaches on bioreactor and car-following benchmarks.

A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems

math.OC · 2024-03-12 · unverdicted · novelty 7.0

SGDA-B is the first backtracking-enabled stochastic GDA algorithm for nonconvex-concave minimax problems that achieves the best known complexity bounds among methods agnostic to L, μ, and σ².

High Probability Guarantees for Random Reshuffling

math.OC · 2023-11-20 · unverdicted · novelty 7.0

High-probability ergodic and last-iterate complexity guarantees for random reshuffling SGD on smooth nonconvex optimization that match best in-expectation bounds up to logarithmic factors without extra assumptions.

Training Hamiltonian neural networks without backpropagation

cs.LG · 2024-11-26 · conditional · novelty 6.0

A backpropagation-free training approach for Hamiltonian neural networks via data-driven parameter sampling that claims over 100x CPU speedup and four orders of magnitude better accuracy on chaotic systems like Hénon-Heiles compared to gradient-based methods.

citing papers explorer

Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies

cs.NE · 2026-02-26 · unverdicted · none · ref 27

Invariant Manifolds of Discrete-time Dynamical Systems with Nonlinear Exosystems via Hybrid Physics-Informed Neural Networks

math.NA · 2025-06-16 · unverdicted · none · ref 98

A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems

math.OC · 2024-03-12 · unverdicted · none · ref 16

High Probability Guarantees for Random Reshuffling

math.OC · 2023-11-20 · unverdicted · none · ref 10

Training Hamiltonian neural networks without backpropagation

cs.LG · 2024-11-26 · conditional · none · ref 11
