Dynamical versus Bayesian phase transitions in a toy model of superposition
3 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (3)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Dead Directions: Geometric Singular Learning
  Dead directions recover Watanabe's RLCT contribution and triple (λ, m, ν) from directional Fisher curvature decay rates in original parameter space for singular models, extended via K-FAC to networks and gauge-equivariant optimizers.
- Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
  A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
- Structure and Scale in Simplicial Sequence Modelling
  Small transformers on HMM prediction tasks exhibit correlated scaling between performance and linear encoding of belief distributions in residual activations.