DASH discovers stronger hybrid attention architectures for LLMs via minutes-scale differentiable search, outperforming selector baselines and Jet-Nemotron on RULER while using 0.006% of prior search tokens.
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
AgentRevive is a Markov state-aware framework with policy learning and edge optimization that manages Active, Standby, and Terminated agent states to enable resilient multi-agent evolution and reduce token consumption.
citing papers explorer
- DASH: Fast Differentiable Architecture Search for Hybrid Attention in Minutes on a Single GPU
- Taming "Zombie" Agents: A Markov State-Aware Framework for Resilient Multi-Agent Evolution