Gated-attention readers for text comprehension, in: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pp
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AI (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Relevance Is Not Permission: Warranted Attention for Value Contributions
  Warrant adds learned permission gates g_ij to attention value terms alpha_ij * v_j, improving primary metrics in 27 of 32 comparisons across CTDG, MTPP, RAG, STPP, and TKG tasks.
- Quantization Inflates Reasoning: Token Inflation as a Hidden Cost of Low-Bit Reasoning Models
  Low-bit post-training quantization of reasoning LLMs increases reasoning token counts while preserving accuracy, introducing a hidden test-time compute cost.