OASIS mitigates attention sinks and outliers in AttnResidual models via Softmax1 null space and inter-layer signals, reporting norm and kurtosis reductions plus large gains in quantized perplexity and task accuracy.
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Attention Sinks and Outliers in Attention Residuals