Introduces Synergistic Faithfulness metric based on Shapley Interaction Index to evaluate cross-modal synergy in VLM explainers, revealing over-reliance on visual salience in existing methods.
Quantifying attention flow in transformers
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5verdicts
UNVERDICTED 5roles
background 1polarities
background 1representative citing papers
Small transformers learn to forecast unseen dynamical systems in-context by using delay embeddings to recover the manifold and forecasting its invariant sets via a transfer-operator strategy.
ASAP prunes tokens in ViTs by anchoring on attention sinks modeled as lazy random walks, using cumulative transition matrices and radial diffusion clustering to compress redundancy while preserving accuracy.
Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.
DAP improves ViT attribution maps by injecting decision-relevant gradients into attention propagation, producing more class-sensitive and faithful explanations than standard attention rollout.
citing papers explorer
-
Measuring Cross-Modal Synergy: A Benchmark for VLM Explainability
Introduces Synergistic Faithfulness metric based on Shapley Interaction Index to evaluate cross-modal synergy in VLM explainers, revealing over-reliance on visual salience in existing methods.
-
Transformers for dynamical systems learn transfer operators in-context
Small transformers learn to forecast unseen dynamical systems in-context by using delay embeddings to recover the manifold and forecasting its invariant sets via a transfer-operator strategy.
-
ASAP: Attention Sink Anchored Pruning
ASAP prunes tokens in ViTs by anchoring on attention sinks modeled as lazy random walks, using cumulative transition matrices and radial diffusion clustering to compress redundancy while preserving accuracy.
-
Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness
Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.
-
Decision-Aware Attention Propagation for Vision Transformer Explainability
DAP improves ViT attribution maps by injecting decision-relevant gradients into attention propagation, producing more class-sensitive and faithful explanations than standard attention rollout.