Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AR (2)
Verdicts: UNVERDICTED (2)
Citing papers:
- ViM-Q: Scalable Algorithm-Hardware Co-Design for Vision Mamba Model Inference on FPGA
  ViM-Q delivers a 4.96x speedup and 59.8x higher energy efficiency for Vision Mamba inference on FPGA versus a quantized GPU baseline, using dynamic activation quantization, per-block APoT weights, and a pipelined SSM engine.
- Low Power Vision Transformer Accelerator with Hardware-Aware Pruning and Optimized Dataflow
  A hardware accelerator for vision transformers that uses dynamic token pruning, ReLU replacement, FFN pruning, and row-wise dataflow to reach 2.31 TOPS/W in 28 nm CMOS with under 2% accuracy loss.