For random 2-layer ReLU networks the dominant eigenspaces of the Fisher information matrix are spanned by spherical harmonics of degree ≤2 and capture 97.7% of the trace independently of parameter count.
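The quantity in this claim, the share of the Fisher information trace carried by the leading eigenvalues, can be estimated numerically. The sketch below is illustrative only and rests on assumptions not stated on this page: a Gaussian regression model with unit noise variance (so the Fisher matrix is the expected outer product of output gradients), standard Gaussian inputs, a network f(x) = m^(-1/2) Σ_j relu(w_j·x), and only the first-layer weights treated as parameters. All function names are hypothetical, and the sketch is not expected to reproduce the 97.7% figure, which depends on the paper's exact setup.

```python
import numpy as np

def per_sample_gradients(W, X):
    """Gradients of f(x) = sum_j relu(w_j . x) / sqrt(m) w.r.t. W, one row per input.

    Assumed model (not taken from the source): only first-layer weights are trained.
    """
    m = W.shape[0]
    active = (X @ W.T > 0).astype(X.dtype)                 # relu'(w_j . x), shape (n, m)
    grads = active[:, :, None] * X[:, None, :] / np.sqrt(m)  # shape (n, m, d)
    return grads.reshape(X.shape[0], -1)                   # flatten to (n, m * d)

def empirical_fisher(W, X):
    """Monte Carlo Fisher matrix E[grad f grad f^T], assuming unit-variance Gaussian noise."""
    G = per_sample_gradients(W, X)
    return G.T @ G / X.shape[0]

def top_k_trace_fraction(F, k):
    """Fraction of trace(F) carried by the k largest eigenvalues."""
    eig = np.linalg.eigvalsh(F)[::-1]                      # sorted in descending order
    return eig[:k].sum() / eig.sum()

rng = np.random.default_rng(0)
d, m, n = 3, 8, 20000                                      # illustrative sizes only
W = rng.standard_normal((m, d))                            # random first-layer weights
X = rng.standard_normal((n, d))                            # standard Gaussian inputs
F = empirical_fisher(W, X)
print(top_k_trace_fraction(F, k=d))                        # share held by the top d eigenvalues
```

Varying `m` while holding `d` fixed gives a rough way to probe whether the captured share changes with the parameter count under these assumptions.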
… σ(x⊤Z) Z 2 γ ∥Z∥ …. Writing Ẑ := Z/∥Z∥, we may apply the tower rule and then rewrite the expectation as √ d + 2E …
Fields: stat.ML (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Approximating Simple ReLU Networks based on Spectral Decomposition of Fisher Information