A theory of non-linear feature learning with one gradient step in two-layer neural networks

Behrad Moniri, Donghwan Lee, Hamed Hassani, Edgar Dobriban · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge

stat.ML · 2026-05-13 · unverdicted · novelty 7.0

In two-layer networks, weak-to-strong training elicits the target feature direction from pre-trained subspaces and preserves correlated off-target features, unlike standard fine-tuning.

citing papers explorer

The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge

stat.ML · 2026-05-13 · unverdicted · none · ref 42
