AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
2 Pith papers cite this work, both from 2026.
Representative citing papers:
- Bastion: Budget-Aware Speculative Decoding with Tree-structured Block Diffusion Drafting
BASTION is a budget-aware speculative decoding framework with adaptive tree-structured block diffusion drafting that reports up to 6.61x speedup and 39% improvement over block-diffusion baselines.
- Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild