Understanding the difficulty of training deep feedforward neural networks
5 Pith papers cite this work.
citing papers explorer
- Isotropic Activation Functions Enable Deindividuated Neurons and Adaptive Topologies
  Isotropic activation functions derived from reparameterisation symmetries and SVD diagonalisation enable function-preserving neuron removal and addition in dense networks, supporting up to 50% sparsification and real-time topology adaptation.
- Invariant Manifolds of Discrete-time Dynamical Systems with Nonlinear Exosystems via Hybrid Physics-Informed Neural Networks
  A hybrid physics-informed framework using polynomials and neural networks approximates invariant manifolds of discrete-time dynamical systems with nonlinear exosystems and shows higher accuracy than pure polynomial or neural approaches on bioreactor and car-following benchmarks.
- A Stochastic GDA Method With Backtracking For Solving Nonconvex Concave Minimax Problems
  SGDA-B is the first backtracking-enabled stochastic GDA algorithm for nonconvex-concave minimax problems that achieves the best known complexity bounds among methods agnostic to L, μ, and σ².
- High Probability Guarantees for Random Reshuffling
  High-probability ergodic and last-iterate complexity guarantees for random reshuffling SGD on smooth nonconvex optimization that match best in-expectation bounds up to logarithmic factors without extra assumptions.
- Training Hamiltonian neural networks without backpropagation
  A backpropagation-free training approach for Hamiltonian neural networks via data-driven parameter sampling that claims over 100x CPU speedup and four orders of magnitude better accuracy on chaotic systems like Hénon-Heiles compared to gradient-based methods.