Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M · 2024 · arXiv 2406.13121

8 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RULER: What's the Real Context Size of Your Long-Context Language Models?

cs.CL · 2024-04-09 · accept · novelty 8.0

RULER shows most long-context LMs drop sharply in performance on complex tasks as length and difficulty increase, with only half maintaining results at 32K tokens.

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture

cs.AI · 2026-05-29 · unverdicted · novelty 7.0

Proposes the Intelligent Computing Architecture (ICA) as a six-layer framework with dual probabilistic-deterministic planes and three Amdahl-style heuristics to unify design of LLM-based systems.

Scalable Model-Based Clustering with Sequential Monte Carlo

stat.ML · 2026-04-16 · unverdicted · novelty 7.0

A memory-efficient SMC clustering method decomposes problems into approximately independent subproblems to handle large-scale online clustering with complex distributions.

ATLAS: All-round Testing of Long-context Abilities across Scales

cs.CL · 2026-05-27 · unverdicted · novelty 5.0

ATLAS is a length-dependent benchmarking framework that evaluates 26 models on 8 capability dimensions and shows substantial rank changes when moving from 128K to 1M token ranges.

Inference Time Context Sparsity: Illusion or Opportunity?

cs.AI · 2026-05-22 · unverdicted · novelty 5.0

Current LLMs remain robust to high levels of inference-time context sparsity across diverse tasks, enabling up to 10x acceleration via sparse kernels.

LLM as Attention-Informed NTM and Topic Modeling as long-input Generation: Interpretability and long-Context Capability

cs.CL · 2025-10-03 · unverdicted · novelty 5.0

LLMs recover interpretable topic structures via attention and achieve competitive topic modeling performance as long-context generators.

World Model on Million-Length Video And Language With Blockwise RingAttention

cs.LG · 2024-02-13 · unverdicted · novelty 5.0

Presents open-source 7B models for million-token video and language understanding via Blockwise RingAttention, setting new benchmarks in retrieval and long video tasks.

Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

cs.CL · 2025-07-07 · unverdicted · novelty 4.0

Gemini 2.5 Pro and Flash models are presented as achieving frontier performance in reasoning, coding, and long-context multimodal tasks while spanning a cost-capability Pareto curve.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Model-Native Computing Architecture: Envisioning Future System Architecture Through the Lens of Computer Architecture cs.AI · 2026-05-29 · unverdicted · none · ref 80
Inference Time Context Sparsity: Illusion or Opportunity? cs.AI · 2026-05-22 · unverdicted · none · ref 25
