Mamba in vision: A comprehensive survey of techniques and applications

2024 · arXiv 2410.03105

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ViM-Q: Scalable Algorithm-Hardware Co-Design for Vision Mamba Model Inference on FPGA

cs.AR · 2026-05-03 · unverdicted · novelty 7.0

ViM-Q delivers a 4.96x speedup and 59.8x higher energy efficiency for Vision Mamba inference on FPGA versus a quantized GPU baseline, using dynamic activation quantization, per-block APoT weights, and a pipelined SSM engine.

ABMAMBA: Multimodal Large Language Model with Aligned Hierarchical Bidirectional Scan for Efficient Video Captioning

cs.CV · 2026-04-09 · unverdicted · novelty 6.0

ABMamba uses Mamba-based linear-complexity processing plus a novel Aligned Hierarchical Bidirectional Scan to deliver competitive video captioning on VATEX and MSR-VTT at roughly 3x higher throughput than typical Transformer MLLMs.

MambaLiteUNet: Cross-Gated Adaptive Feature Fusion for Robust Skin Lesion Segmentation

cs.CV · 2026-04-22 · unverdicted · novelty 5.0

MambaLiteUNet integrates Mamba into U-Net with adaptive fusion, local-global mixing, and cross-gated attention modules to reach 87.12% IoU and 93.09% Dice on skin lesion datasets while cutting parameters by 93.6%.

HiFi-Mamba: Dual-Stream W-Laplacian Enhanced Mamba for High-Fidelity MRI Reconstruction

eess.IV · 2025-08-07 · unverdicted · novelty 5.0

HiFi-Mamba uses stacked W-Laplacian spectral decoupling and unidirectional HiFi-Mamba blocks to improve high-frequency detail preservation and efficiency over prior Mamba, CNN, and Transformer models for MRI reconstruction.

citing papers explorer

ViM-Q: Scalable Algorithm-Hardware Co-Design for Vision Mamba Model Inference on FPGA
cs.AR · 2026-05-03 · unverdicted · none · ref 19
ABMAMBA: Multimodal Large Language Model with Aligned Hierarchical Bidirectional Scan for Efficient Video Captioning
cs.CV · 2026-04-09 · unverdicted · none · ref 50
MambaLiteUNet: Cross-Gated Adaptive Feature Fusion for Robust Skin Lesion Segmentation
cs.CV · 2026-04-22 · unverdicted · none · ref 25
HiFi-Mamba: Dual-Stream W-Laplacian Enhanced Mamba for High-Fidelity MRI Reconstruction
eess.IV · 2025-08-07 · unverdicted · none · ref 22
