BlackMamba: Mixture of experts for state-space models. arXiv preprint arXiv:2402.01771

· 2024 · arXiv 2402.01771

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Hidden State Poisoning Attacks against Mamba-based Language Models

cs.CL · 2026-01-05 · unverdicted · novelty 7.0

Short input phrases can irreversibly overwrite hidden states in Mamba models, impairing information retrieval on a new benchmark while leaving pure Transformer models unaffected.

ZAYA1-8B Technical Report

cs.AI · 2026-05-06 · unverdicted · novelty 6.0

ZAYA1-8B is a reasoning MoE model with 700M active parameters that matches larger models on math and coding benchmarks and reaches 91.9% on AIME'25 via Markovian RSA test-time compute.

Efficient RWKV-based Representation Learning for 3D Point Clouds

cs.CV · 2026-06-09 · unverdicted · novelty 5.0

Introduces P-RWKV block and PointER self-supervised framework to adapt RWKV for efficient 3D point cloud representation learning.

ZONOS2 Technical Report

cs.SD · 2026-06-23 · unverdicted · novelty 4.0

ZONOS2 8B is a scaled MoE TTS model with 900M active parameters trained on 6M hours of data that reports competitive SOTA results on naturalness, speaker similarity, WER, and a new ZTTS1-Eval benchmark while releasing weights and code.

A Survey on Efficient Inference for Large Language Models

cs.CL · 2024-04-22 · accept · novelty 3.0

The paper surveys techniques to speed up and reduce the resource needs of LLM inference, organized by data-level, model-level, and system-level changes, with comparative experiments on representative methods.

A Survey of Mamba

cs.LG · 2024-08-02 · unverdicted · novelty 2.0

The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.

citing papers explorer

ZAYA1-8B Technical Report cs.AI · 2026-05-06 · unverdicted · none · ref 21
