Quantifying attention flow in transformers

Samira Abnar, Willem Zuidema · 2020

5 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Measuring Cross-Modal Synergy: A Benchmark for VLM Explainability

cs.AI · 2026-05-21 · unverdicted · novelty 7.0

Introduces a Synergistic Faithfulness metric based on the Shapley Interaction Index to evaluate cross-modal synergy in VLM explainers, revealing over-reliance on visual salience in existing methods.

Transformers for dynamical systems learn transfer operators in-context

cs.LG · 2026-02-21 · unverdicted · novelty 6.0

Small transformers learn to forecast unseen dynamical systems in-context by using delay embeddings to recover the manifold and forecasting its invariant sets via a transfer-operator strategy.

ASAP: Attention Sink Anchored Pruning

cs.LG · 2026-05-21 · unverdicted · novelty 5.0

ASAP prunes tokens in ViTs by anchoring on attention sinks modeled as lazy random walks, using cumulative transition matrices and radial diffusion clustering to compress redundancy while preserving accuracy.

Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.

Decision-Aware Attention Propagation for Vision Transformer Explainability

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

DAP improves ViT attribution maps by injecting decision-relevant gradients into attention propagation, producing more class-sensitive and faithful explanations than standard attention rollout.

citing papers explorer

Showing 5 of 5 citing papers.

Measuring Cross-Modal Synergy: A Benchmark for VLM Explainability · cs.AI · 2026-05-21 · unverdicted · none · ref 21
Transformers for dynamical systems learn transfer operators in-context · cs.LG · 2026-02-21 · unverdicted · none · ref 38
ASAP: Attention Sink Anchored Pruning · cs.LG · 2026-05-21 · unverdicted · none · ref 6
Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness · cs.CV · 2026-05-08 · unverdicted · none · ref 1
Decision-Aware Attention Propagation for Vision Transformer Explainability · cs.CV · 2026-04-20 · unverdicted · none · ref 12
