Preprint · 2021 · arXiv:2112.07658

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Token-Sparse Medical Multimodal Reasoning via Dual-Stream Reinforcement Learning

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

ViToS uses dual-stream RL with cross-feedback optimization to prune medical image tokens to 77% length while reporting 108.27% and 104.16% relative performance on two 7B VLMs across seven benchmarks.

When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics

cs.CV · 2026-06-02 · unverdicted · novelty 6.0

STS is a two-stage pruning framework that decouples structural diversity via repulsion sampling from semantic filtering via cross-attention to reduce redundancy in visual tokens for VLMs.

ViT-FREE: Efficient Face Recognition via Early Exiting and Synthetic Adaptation

cs.CV · 2026-06-10 · conditional · novelty 4.0

ViT-FREE enables early exiting from pretrained ViTs for face verification, with up to 20% speedup and a 1.5 accuracy drop on IJB-C, plus a synthetic-data fine-tuning variant for shallow exits.

citing papers explorer

Showing 3 of 3 citing papers after filters.

Token-Sparse Medical Multimodal Reasoning via Dual-Stream Reinforcement Learning · cs.CV · 2026-06-30 · unverdicted · none · ref 23
When Attention Collapses: Stage-Aware Visual Token Pruning from Structure to Semantics · cs.CV · 2026-06-02 · unverdicted · none · ref 28
ViT-FREE: Efficient Face Recognition via Early Exiting and Synthetic Adaptation · cs.CV · 2026-06-10 · conditional · none · ref 52
