LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Verdicts: unverdicted (2)
Citing papers
- Prompt Compression in Diffusion Large Language Models: Evaluating LLMLingua-2 on LLaDA
  Evaluating LLMLingua-2 at 2x compression on LLaDA shows non-uniform transfer to diffusion LLMs, with mathematical reasoning degrading substantially despite high BERTScore while summarization remains more robust.
- The Efficiency Frontier: A Unified Framework for Cost-Performance Optimization in LLM Context Management
  Introduces Efficiency Frontier framework for deployment-aware cost-performance optimization of LLM context strategies, reporting ~25% token reduction at F1≈0.78 on 5,000 HotpotQA instances.