Kolmogorov-Arnold Networks (KANs), which place learnable univariate spline activations on edges, achieve better accuracy than MLPs with fewer parameters, show faster neural scaling, and can be visualized directly to support scientific discovery.
Learning activation functions: A new paradigm for understanding neural networks
3 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
Citation polarities: background (2)
citing papers explorer
- More Expressive Feedforward Layers: Part I. Token-Adaptive Mixing of Activations
  Mixture of Activations mixes activation functions token-adaptively in FFNs via lightweight gates, is strictly more expressive than fixed or learnable activations, and yields lower pretraining loss for models from 0.12B to 2B parameters.
- Singularity Formation: Synergy in Theoretical, Numerical and Machine Learning Approaches
  The work introduces a modulation-based analytical method for singularity proofs in singular PDEs and refines ML techniques such as PINNs and KANs to identify blowup solutions, with application to the open 3D Keller-Segel problem.