
Dart: Diffusion-inspired speculative decoding for fast llm inference

9 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: background 3

fields: cs.CL 7, cs.LG 2

years: 2026 9

verdicts: UNVERDICTED 9



representative citing papers

Cost-Aware Diffusion Draft Trees for Speculative Decoding

cs.CL · 2026-06-01 · unverdicted · novelty 7.0

CaDDTree jointly selects tree structure and budget to maximize expected tokens per unit time in speculative decoding, proving unimodality under convex verification cost and matching oracle DDTree performance on Qwen models.

SpecBlock: Block-Iterative Speculative Decoding with Dynamic Tree Drafting

cs.CL · 2026-05-08 · unverdicted · novelty 7.0 · 2 refs

SpecBlock achieves 8-13% higher mean speedup than EAGLE-3 at 44-52% drafting cost via block-iterative drafting with hidden-state inheritance, dynamic rank-head branching, valid-prefix masking, and optional cost-aware bandit adaptation.

Draft-OPD: On-Policy Distillation for Speculative Draft Models

cs.CL · 2026-05-28 · unverdicted · novelty 6.0

Draft-OPD applies on-policy distillation via target-assisted generation and error replay to train speculative draft models, yielding over 5x lossless acceleration and gains over EAGLE-3 and DFlash.

Accelerating Speculative Decoding with Block Diffusion Draft Trees

cs.CL · 2026-04-14 · unverdicted · novelty 6.0

DDTree builds a draft tree from a block diffusion drafter using a best-first heap on its output probabilities and verifies the tree in one target-model pass via an ancestor-only attention mask, increasing average accepted tokens per round.
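The DDTree summary above names two mechanisms: a best-first heap over drafter probabilities and an ancestor-only attention mask. A minimal sketch of both follows. It is an illustration under stated assumptions, not the paper's implementation: it assumes the drafter yields an independent, strictly positive, descending-sorted distribution per draft position, and the names `build_draft_tree`, `ancestor_mask`, `probs`, and `budget` are hypothetical.

```python
import heapq
import math


def build_draft_tree(probs, budget):
    """Best-first draft-tree construction (illustrative sketch).

    probs[d] is a list of (token, prob) pairs for draft position d, sorted by
    descending prob, with every prob > 0. ASSUMPTION: positions are scored
    independently, so a path's score is the product of its per-position probs.
    Returns a list of nodes (token, parent_index, depth); parent_index is -1
    for nodes attached directly to the already-committed prefix.
    """
    nodes = []
    heap = []  # entries: (negative log score, tie-breaker, parent, depth, rank)
    tie = 0
    if probs and probs[0]:
        heap.append((-math.log(probs[0][0][1]), tie, -1, 0, 0))
    while heap and len(nodes) < budget:
        neg_lp, _, parent, depth, rank = heapq.heappop(heap)
        token, p = probs[depth][rank]
        idx = len(nodes)
        nodes.append((token, parent, depth))
        # Next-ranked sibling: same parent and depth, swap in the next token.
        if rank + 1 < len(probs[depth]):
            p_next = probs[depth][rank + 1][1]
            tie += 1
            sibling_score = neg_lp + math.log(p) - math.log(p_next)
            heapq.heappush(heap, (sibling_score, tie, parent, depth, rank + 1))
        # First child: extend this path with the top token at the next position.
        if depth + 1 < len(probs):
            p_child = probs[depth + 1][0][1]
            tie += 1
            heapq.heappush(heap, (neg_lp - math.log(p_child), tie, idx, depth + 1, 0))
    return nodes


def ancestor_mask(nodes):
    """mask[i][j] is True iff node j is node i itself or one of its ancestors."""
    n = len(nodes)
    mask = [[False] * n for _ in range(n)]
    for i in range(n):
        j = i
        while j != -1:
            mask[i][j] = True
            j = nodes[j][1]  # walk up to the parent
    return mask


# Toy example: two draft positions, two candidate tokens each, four tree nodes.
toy_probs = [[("a", 0.6), ("b", 0.4)], [("c", 0.7), ("d", 0.3)]]
tree = build_draft_tree(toy_probs, budget=4)
mask = ancestor_mask(tree)
```

Because a child is pushed only when its parent is popped, every parent gets a smaller index than its children, so the mask is lower-triangular. In a real verifier, the committed prefix would also be visible to every tree node; that part is omitted here.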
