Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 3
roles: background 1
polarities: background 1
citing papers explorer
- Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders
  C-GSPN scales 2D spatial propagation to foundation vision encoders via a fast CUDA kernel, compressed blocks, and two-stage distillation, matching ViT performance with 15% fewer parameters and 4x block speedup at 2K resolution.
- Can Visual Mamba Improve AI-Generated Image Detection? An In-Depth Investigation
  Benchmarks Vision Mamba variants for AI-generated image detection against CNN, ViT, and VLM detectors on diverse datasets and synthetic sources, reporting promise alongside limitations.
- A Survey of Mamba
  The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.