QuantSpec: Self-speculative decoding with hierarchical quantized KV cache

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

cs.AR 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding

cs.AR · 2026-05-26 · unverdicted · novelty 5.0

Cassandra is a self-speculative decoding system that builds a draft model via fine-grained data selection and optimized pruning/mantissa truncation, achieving up to 2.41x speedup over BF16 and 1.81x more tokens than Eagle-3 on Llama 3 8B without training.

citing papers explorer

Showing 1 of 1 citing paper.

  • Cassandra: Enabling Reasoning LLMs at Edge via Self-Speculative Decoding cs.AR · 2026-05-26 · unverdicted · none · ref 58