AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

AngelSlim: A more accessible, comprehensive, efficient toolkit for large model compression · 2026 · arXiv 2602.21233

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Bastion: Budget-Aware Speculative Decoding with Tree-structured Block Diffusion Drafting

cs.LG · 2026-05-28 · unverdicted · novelty 7.0

BASTION is a budget-aware speculative decoding framework with adaptive tree-structured block diffusion drafting that reports up to 6.61x speedup and 39% improvement over block-diffusion baselines.

DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding

cs.CL · 2026-06-01 · unverdicted · novelty 6.0

DFlare replaces DFlash's shared fused representation with per-draft-layer attention to distinct target-layer combinations. This enables deeper drafts and training on 2.4M samples, giving 5-11% higher speedups than DFlash on Qwen3 and GPT-OSS models.

Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild

cs.CL · 2026-05-21 · unverdicted · novelty 3.0 · 2 refs

Hy-MT2 presents three new multilingual translation models that claim to outperform listed open-source and commercial systems on diverse tasks while enabling low-storage on-device use.

citing papers explorer

Bastion: Budget-Aware Speculative Decoding with Tree-structured Block Diffusion Drafting

cs.LG · 2026-05-28 · unverdicted · none · ref 56

BASTION is a budget-aware speculative decoding framework with adaptive tree-structured block diffusion drafting that reports up to 6.61x speedup and 39% improvement over block-diffusion baselines.
