Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Interpreting and Steering State-Space Models via Activation Subspace Bottlenecks
Identifies activation subspace bottlenecks in SSMs and demonstrates that scalar scaling of these subspaces at test time yields 8.27% average gains across 7 models and 6 benchmarks, plus an improved Stable-Mamba architecture.