Quamba2: Robust and efficient post-training quantization for selective state space models. arXiv preprint arXiv:2503.22879, 2025

Hung-Yi Chiang, Hung-Yueh Guo, Zhewei Chang, Andreas Gerstlauer, Diana Ding · 2025 · arXiv 2503.22879

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Ternary Mamba: Grouped Quantization-Aware Training of W1.58A16 State Space Models

cs.LG · 2026-06-16 · unverdicted · novelty 6.0

Ternary Mamba-2 1.3B models reach 48.1% zero-shot accuracy after QAT from pretrained checkpoints on 102M tokens, close to Bi-Mamba, with 3.61x compression.

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

cs.LG · 2026-06-02 · unverdicted · novelty 5.0

MOSAIC uses an Integer Linear Program scheduler for expert placement and prompt assignment plus adaptive aggregation to achieve 1.7-2.3x end-to-end speedup on 4-GPU MoA workloads while keeping accuracy within 0.1pp.

citing papers explorer

Ternary Mamba: Grouped Quantization-Aware Training of W1.58A16 State Space Models

cs.LG · 2026-06-16 · unverdicted · none · ref 2

MOSAIC: Efficient Mixture-of-Agent Scheduling via Adaptive Aggregation and Inference Concurrency

cs.LG · 2026-06-02 · unverdicted · none · ref 6
