CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction

Zhuoran Song, Yihong Xu, Zhezhi He, Li Jiang, Naifeng Jing, Xiaoyao Liang · 2022 · arXiv 2203.04570

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Token Merging: Your ViT But Faster

cs.CV · 2022-10-17 · unverdicted · novelty 6.0

Token Merging (ToMe) doubles the throughput of large Vision Transformers on images, video, and audio by merging similar tokens with a fast matching algorithm, incurring only 0.2-0.4% accuracy loss.

Temporal Aware Pruning for Efficient Diffusion-based Video Generation

cs.CV · 2026-05-18 · unverdicted · novelty 5.0 · 2 refs

TAPE applies temporal-aware token pruning with smoothing, reselection, and timestep scheduling to speed up video diffusion models while preserving visual fidelity and coherence.

Where Do Tokens Go? Understanding Pruning Behaviors in STEP at High Resolutions

cs.CV · 2025-09-17 · unverdicted · novelty 5.0

STEP uses dynamic superpatch merging via dCTS and early token exits to cut token count by 2.5x and computational complexity by up to 4x on ViT-Large for high-res segmentation, with at most 2% accuracy drop and 40% tokens halted early.

AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning

cs.AI · 2024-12-24 · unverdicted · novelty 5.0

AutoSculpt models DNNs as graphs, embeds pruning patterns, and uses deep reinforcement learning to reach up to 90% pruning and 18% better FLOPs reduction than baselines on ResNet, MobileNet, VGG, and Vision Transformers.

citing papers explorer

Showing 4 of 4 citing papers.

Token Merging: Your ViT But Faster
cs.CV · 2022-10-17 · unverdicted · none · ref 9
Temporal Aware Pruning for Efficient Diffusion-based Video Generation
cs.CV · 2026-05-18 · unverdicted · none · ref 90 · 2 links
Where Do Tokens Go? Understanding Pruning Behaviors in STEP at High Resolutions
cs.CV · 2025-09-17 · unverdicted · none · ref 31
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
cs.AI · 2024-12-24 · unverdicted · none · ref 44
