MatFormer: Nested Transformer for Elastic Inference
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: background 1
verdicts: UNVERDICTED 3
citing papers explorer
- Star Elastic: Many-in-One Reasoning LLMs with Efficient Budget Control
  Star Elastic trains N nested submodels in a single post-training job on a parent reasoning LLM, supporting elastic budget control that matches or exceeds independent baselines while cutting training compute by up to 360x.
- GRINQH: Graded Input-based Quantization Hierarchy for Efficient LLM Generation
  GRINQH introduces a graded input-based quantization hierarchy that dynamically assigns multi-precision weights using activation magnitudes as an importance proxy, unifying quantization with sparsification to improve LLM decoding speed and quality trade-offs on Llama3 and Qwen3 models.
- Small Language Models are the Future of Agentic AI
  Small language models are sufficiently capable, more suitable, and far more economical than large models for the repetitive tasks that dominate agentic AI systems.