SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Geometry-Adaptive Explainer for Faithful Dictionary-Based Interpretability under Distribution Shift
  GAE reduces the faithfulness gap in dictionary-based explainers under distribution shift by geometrically realigning the ID dictionary to the OOD-active subspace, with a quadratic excess-loss bound.
- TOAST: Transformer Optimization using Adaptive and Simple Transformations
  TOAST approximates full transformer blocks in pretrained models via lightweight closed-form mappings to cut parameters and FLOPs without retraining or finetuning.