Speculative decoding and beyond: An in-depth survey of techniques
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- DREAM-S: Speculative Decoding with Searchable Drafting and Target-Aware Refinement for Multimodal Generation
  DREAM-S combines neural architecture search, target-aware supernet training, and attention-entropy-guided distillation to accelerate speculative decoding in VLMs, reporting up to 3.85x speedup over standard methods.
- RISE: Relay Inference and Online Scheduling for Efficient Edge-Device Collaborative Diffusion Model Services
  RISE introduces a training-free relay inference mechanism for diffusion models across edge and device plus a contextual bandit scheduler, reporting up to 2.1x speedup with preserved quality on two benchmarks.