PPT: Token Pruning and Pooling for Efficient Vision Transformers
4 Pith papers cite this work.
citing papers explorer
- Rethink MAE with Linear Time-Invariant Dynamics
Token order in frozen visual representations is exploitable via SSM-based LTI probes, revealing pre-training-dependent heterogeneity that fixed pooling misses.
- MPM: Mutual Pair Merging for Efficient Vision Transformers
MPM merges mutual nearest-neighbor token pairs in cosine space for ViTs, records a merge map for reconstruction, and delivers up to 60% latency reduction on Raspberry Pi 5 and 20% throughput gain on H100 with under 3% mIoU drop on ADE20K.
- ASAP: Attention Sink Anchored Pruning
ASAP prunes tokens in ViTs by anchoring on attention sinks modeled as lazy random walks, using cumulative transition matrices and radial diffusion clustering to compress redundancy while preserving accuracy.
- Where Do Tokens Go? Understanding Pruning Behaviors in STEP at High Resolutions
STEP uses dynamic superpatch merging via dCTS and early token exits to cut token count by 2.5x and computational complexity by up to 4x on ViT-Large for high-res segmentation, with at most 2% accuracy drop and 40% tokens halted early.