arXiv preprint arXiv:2506.06489 , year =

Alternating gradient flows: A theory of feature learning in two-layer neural networks , author= · 2025 · arXiv 2506.06489

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Neural Networks Provably Learn Spectral Representations for Group Composition

cs.LG · 2026-06-02 · unverdicted · novelty 8.0

Two-layer neural networks provably converge almost surely to irreducible representations of finite groups when trained on the group composition task, with the dynamics governed by Riemannian gradient ascent on a representation-theoretic energy functional.

A Theory of Saddle Escape in Deep Nonlinear Networks

cs.LG · 2026-05-02 · unverdicted · novelty 8.0 · 3 refs

Derives exact Frobenius norm imbalance identity for deep nonlinear networks, classifies activations into four classes, and obtains critical-depth escape time law τ★ = Θ(ε^{-(r-2)}) from reduction to scalar ODE on permutation-symmetric submanifold.

Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.

Arithmetic in the Wild: Llama uses Base-10 Addition to Reason About Cyclic Concepts

cs.AI · 2026-05-01 · unverdicted · novelty 7.0

Llama-3.1-8B computes sums for cyclic concepts using base-10 addition via task-agnostic Fourier features with periods 2, 5, and 10 rather than modular arithmetic in the concept period.

There Will Be a Scientific Theory of Deep Learning

stat.ML · 2026-04-23 · unverdicted · novelty 2.0

A mechanics of the learning process is emerging in deep learning theory, characterized by dynamics, coarse statistics, and falsifiable predictions across idealized settings, limits, laws, hyperparameters, and universal behaviors.

citing papers explorer

Showing 2 of 2 citing papers after filters.

A Theory of Saddle Escape in Deep Nonlinear Networks cs.LG · 2026-05-02 · unverdicted · none · ref 29 · 3 links
Derives exact Frobenius norm imbalance identity for deep nonlinear networks, classifies activations into four classes, and obtains critical-depth escape time law τ★ = Θ(ε^{-(r-2)}) from reduction to scalar ODE on permutation-symmetric submanifold.
Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior cs.LG · 2026-05-06 · unverdicted · none · ref 7
Manifold steering along activation geometry induces behavioral trajectories matching the natural manifold of outputs, while linear steering produces off-manifold unnatural behaviors.

arXiv preprint arXiv:2506.06489 , year =

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer