A new framework shows concept subspaces are not unique, estimator choice affects containment and disentanglement, LEACE works well but generalizes poorly, and HuBERT encodes phone info as contained and disentangled from speaker info while speaker info resists compact containment.
and Tolias, Andreas S
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Sparse autoencoders applied to GPT-2 and Llama models recover semantic features accounting for 94% of peak brain encoding performance and map onto distinct cortical semantic regions across three languages.
citing papers explorer
-
A framework for analyzing concept representations in neural models
A new framework shows concept subspaces are not unique, estimator choice affects containment and disentanglement, LEACE works well but generalizes poorly, and HuBERT encodes phone info as contained and disentangled from speaker info while speaker info resists compact containment.
-
Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography
Sparse autoencoders applied to GPT-2 and Llama models recover semantic features accounting for 94% of peak brain encoding performance and map onto distinct cortical semantic regions across three languages.