QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev · 2021 · DOI 10.18653/v1/2021.naacl-main.472

7 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Lynx: Progressive Speculative Quantization for accelerating KV Transfer in Long-Context Inference

cs.DC · 2026-07-02 · unverdicted · novelty 7.0

Lynx partitions KV cache bits into anchor and residual streams for progressive transfer, enabling speculative decoding on partial data followed by verification to match BF16 accuracy at 4-bit-like TTFT.

TwinRouterBench: Fast Static and Live Dynamic Evaluation for Realistic Agentic LLM Routing

cs.LG · 2026-05-14 · accept · novelty 7.0 · 2 refs

TwinRouterBench supplies 970 execution-verified router prefixes across five datasets plus a live harness for 100 held-out SWE-bench cases, scoring routers on tier accuracy, trajectory success, and realized token cost without LLM judges.

Less is More: Quality-Aware Training Data Selection for Scientific Summarization

cs.CL · 2026-06-23 · unverdicted · novelty 6.0

A 1.88-million-article biomedical summarization dataset is released and quality-aware selection of training data based on abstract alignment outperforms random sampling on factuality metrics.

Interdomain Attention: Beyond Token-Level Key-Value Memory

cs.LG · 2026-05-23 · unverdicted · novelty 6.0

Interdomain Attention integrates SSMs into attention via finite feature maps and basis projections to enable query-conditioned attention over fixed states, showing gains over SSM baselines and matching softmax at 1.3B scale with length-flat scaling.

Predictive Prefetching for Retrieval-Augmented Generation

cs.CL · 2026-05-18 · unverdicted · novelty 6.0

Introduces predictive prefetching for RAG that anticipates retrieval needs several tokens ahead via three components, reporting up to 43.5% latency reduction and 62.4% TTFT improvement while preserving answer quality.

Topic-to-Timestamp Alignment by Constrained Evidence Selection

cs.CL · 2026-06-18 · unverdicted · novelty 5.0

Constrained candidate selection from retrieved chunks raises Recall@5 from 31.9% to 50.0% and parseable outputs on 420 queries from 200 municipal meeting transcripts.

Gated Delta Networks: Improving Mamba2 with Delta Rule

cs.CL · 2024-12-09 · unverdicted · novelty 5.0

Gated DeltaNet integrates gating and delta rules into linear transformers, outperforming Mamba2 and DeltaNet on language modeling, reasoning, retrieval, and long-context tasks.

