Unlocking efficiency in large language model inference: A comprehensive survey of speculative decoding

Heming Xia, Zhe Yang, Qingxiu Dong, Peiyi Wang, Yongqi Li, Tao Ge, Tianyu Liu, Wenjie Li, Zhifang Sui · 2024 · DOI 10.18653/v1/2024.findings-acl.456

Two Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

cs.DC · 2026-02-10 · unverdicted · novelty 6.0

SPEED-Bench is a new standardized benchmark for speculative decoding that supplies semantically diverse qualitative data and throughput-oriented splits across concurrency levels, integrated with vLLM and TensorRT-LLM.

LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

cs.CL · 2025-07-02 · unverdicted · novelty 5.0

LogitSpec accelerates retrieval-based speculative decoding by speculating the next-next token from the last logit and retrieving relevant references for both next and next-next tokens, reporting up to 2.61x speedup and 3.28 mean accepted tokens.

citing papers explorer

Showing 2 of 2 citing papers.

SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding

cs.DC · 2026-02-10 · unverdicted · none · ref 50

LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

cs.CL · 2025-07-02 · unverdicted · none · ref 25
