2 Pith papers cite this work.
Citation summary
- years: 2026 (2)
- verdicts: unverdicted (2)
- roles: method (1)
- polarities: use method (1)
Representative citing papers
A 1D token interface with Selective Token Editing improves multimodal image fusion by modeling global appearance factors separately from local 2D structures, yielding best overall performance on four benchmarks.
Citing papers explorer
- FASER: Fine-Grained Phase Management for Speculative Decoding in Dynamic LLM Serving
FASER delivers up to 53% higher throughput and 1.92x lower latency in dynamic LLM serving by adjusting speculative lengths per request, pruning rejected tokens early, and overlapping the draft and verification phases via frontiers.
- From 2D Grids to 1D Tokens: Reforming Shared Representations for Multimodal Image Fusion
A 1D token interface with Selective Token Editing improves multimodal image fusion by modeling global appearance factors separately from local 2D structures, yielding best overall performance on four benchmarks.