DynamicViT: Efficient vision transformers with dynamic token sparsification. arXiv preprint arXiv:2106.02034

Yongming Rao, Wenliang Zhao, Benlin Liu, Jiwen Lu, Jie Zhou, Cho-Jui Hsieh · 2021 · arXiv 2106.02034

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

DC-DiT: Adaptive Compute and Elastic Inference for Visual Generation via Dynamic Chunking

cs.CV · 2026-03-06 · unverdicted · novelty 7.0

DC-DiT learns dynamic chunking to allocate fewer tokens to smooth or noisy regions and more to detailed or late-stage areas, cutting inference FLOPs up to 36.8% while improving FID up to 37.8% on class-conditional ImageNet generation.

MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer

cs.CV · 2021-10-05 · unverdicted · novelty 6.0

MobileViT is a lightweight vision transformer that reports 78.4% top-1 accuracy on ImageNet-1k with ~6M parameters, outperforming MobileNetv3 by 3.2% and DeIT by 6.2% at similar size, plus gains on MS-COCO detection.

citing papers explorer

Showing 2 of 2 citing papers.

DC-DiT: Adaptive Compute and Elastic Inference for Visual Generation via Dynamic Chunking · cs.CV · 2026-03-06 · unverdicted · none · ref 13
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer · cs.CV · 2021-10-05 · unverdicted · none · ref 15
