Cobra: Extending Mamba to multi-modal large language model for efficient inference
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 3
roles: background 1
polarities: background 1
representative citing papers
MambaADv2 evolves Mamba state space models with hybrid blocks, frequency convolutions, and adaptive scanning for improved unsupervised anomaly detection.
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.
citing papers explorer
- Zamba2-VL Technical Report
  Zamba2-VL is a family of 1.2B–7B hybrid Mamba2-transformer vision-language models that match leading transformer VLMs on image, reasoning, OCR, grounding and counting benchmarks while delivering roughly 10x lower time-to-first-token.
- MambaADv2: Evolving Duality-enhanced State Space Model for Unsupervised Anomaly Detection
  MambaADv2 evolves Mamba state space models with hybrid blocks, frequency convolutions, and adaptive scanning for improved unsupervised anomaly detection.