Enhancing neural network interpretability with feature-aligned sparse autoencoders

Luke Marks, Alasdair Paren, David Krueger, Fazl Barez · 2024 · arXiv 2411.01220

3 Pith papers cite this work.

representative citing papers

Aligned Training: A Parameter-Free Method to Improve Feature Quality and Stability of Sparse Autoencoders (SAE)

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Aligned training reparameterizes SAEs to enforce unit inner product between encoder and decoder directions, eliminating dead features and enhancing stability without hyperparameters.

Improving Sparse Autoencoder with Dynamic Attention

cs.LG · 2026-04-16 · unverdicted · novelty 7.0

A cross-attention SAE with sparsemax attention achieves lower reconstruction loss and higher-quality concepts than fixed-sparsity baselines by making activation counts data-dependent.

Sparse Autoencoders as a Steering Basis for Phase Synchronization in Graph-Based CFD Surrogates

cs.CE · 2026-03-28 · unverdicted · novelty 6.0

Sparse autoencoders enable phase synchronization in frozen graph CFD surrogates through Hilbert-identified oscillatory features and SVD-based time-varying rotations.

citing papers explorer

Aligned Training: A Parameter-Free Method to Improve Feature Quality and Stability of Sparse Autoencoders (SAE)

cs.LG · 2026-05-18 · unverdicted · none · ref 21

Improving Sparse Autoencoder with Dynamic Attention

cs.LG · 2026-04-16 · unverdicted · none · ref 38

Sparse Autoencoders as a Steering Basis for Phase Synchronization in Graph-Based CFD Surrogates

cs.CE · 2026-03-28 · unverdicted · none · ref 7
