Context as a tool: Context management for long-horizon SWE-agents. arXiv preprint arXiv:2512.22087

· 2025 · arXiv 2512.22087

10 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

The Time is Here for Just-in-Time Systems: Challenges and Opportunities

cs.DB · 2026-05-22 · unverdicted · novelty 7.0

Jitskit is an iterative LLM-based synthesis pipeline that generates key-value stores matching spec cards for YCSB workloads, resources, and properties, outperforming SOTA baselines on all 18 tested cases by up to 4.6x.

Remember Your Trace: Memory-Guided Long-Horizon Agentic Framework for Consistent and Hierarchical Repository-Level Code Documentation

cs.SE · 2026-05-14 · unverdicted · novelty 7.0

MemDocAgent generates consistent hierarchical repository-level code documentation by combining dependency-aware traversal with memory-guided agent interactions that accumulate work traces.

LEAD: Breaking the No-Recovery Bottleneck in Long-Horizon Reasoning

cs.AI · 2026-03-06 · unverdicted · novelty 7.0

LEAD lets LLMs solve checkers jumping puzzles up to size 13 by using lookahead to recover from irreversible errors on hard steps that break extreme decomposition.

LLM Agents Are Latent Context Managers: Eliciting Self-Managed Context via a Proprioceptive Dashboard

cs.CL · 2026-06-29 · unverdicted · novelty 6.0

VISTA supplies LLM agents with a visible proprioceptive dashboard of typed context blocks, enabling untrained self-management that lifts performance on long-horizon tool-use benchmarks across multiple model scales.

SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent

cs.AI · 2026-05-23 · unverdicted · novelty 6.0

SAM is a standalone memory framework for long-horizon LLM agents that creates state-adaptive cues from interactions, preserves raw trajectories for intent-driven recall, and optimizes the module via expert supervision and RL, outperforming baselines on BrowseComp and related benchmarks.

GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)

cs.CL · 2026-04-18 · unverdicted · novelty 6.0

GenericAgent outperforms other LLM agents on long-horizon tasks by maximizing context information density with fewer tokens via minimal tools, on-demand memory, trajectory-to-SOP evolution, and compression.

SWE-MeM: Learning Adaptive Memory Management for Long-Horizon Coding Agents

cs.SE · 2026-06-26 · unverdicted · novelty 5.0

SWE-MeM introduces adaptive memory management for coding agents via synthesized trajectories and Memory-aware GRPO, reporting 43.4% and 60.2% resolve rates on SWE-Bench Verified for 4B and 30B models while beating baselines on performance and token use.

On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length

cs.AI · 2026-05-04 · unverdicted · novelty 5.0

Longer action horizons bottleneck LLM agent training through instability, but training with reduced horizons stabilizes learning and enables better generalization to longer horizons.

From Question Answering to Task Completion: A Survey on Agent System and Harness Design

cs.AI · 2026-06-14 · unverdicted · novelty 4.0

This survey frames LLM agents as model-plus-harness systems, decomposes harness responsibilities, maps them to tasks, and highlights open challenges in evaluation, safety, and co-evolution.

Less Context, Better Agents: Efficient Context Engineering for Long-Horizon Tool-Using LLM Agents

cs.AI · 2026-06-08 · unverdicted · novelty 4.0

On a hotel expense benchmark, pruning LLM agent context to the last 5 tool pairs plus summarization raises completion to 91.6% and cuts tokens by ~63% compared with retaining full conversation history.

citing papers explorer

Showing 10 of 10 citing papers.

The Time is Here for Just-in-Time Systems: Challenges and Opportunities · cs.DB · 2026-05-22 · unverdicted · none · ref 31
Remember Your Trace: Memory-Guided Long-Horizon Agentic Framework for Consistent and Hierarchical Repository-Level Code Documentation · cs.SE · 2026-05-14 · unverdicted · none · ref 26
LEAD: Breaking the No-Recovery Bottleneck in Long-Horizon Reasoning · cs.AI · 2026-03-06 · unverdicted · none · ref 4
LLM Agents Are Latent Context Managers: Eliciting Self-Managed Context via a Proprioceptive Dashboard · cs.CL · 2026-06-29 · unverdicted · none · ref 18
SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent · cs.AI · 2026-05-23 · unverdicted · none · ref 18
GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0) · cs.CL · 2026-04-18 · unverdicted · none · ref 5
SWE-MeM: Learning Adaptive Memory Management for Long-Horizon Coding Agents · cs.SE · 2026-06-26 · unverdicted · none · ref 25
On Training Large Language Models for Long-Horizon Tasks: An Empirical Study of Horizon Length · cs.AI · 2026-05-04 · unverdicted · none · ref 62
From Question Answering to Task Completion: A Survey on Agent System and Harness Design · cs.AI · 2026-06-14 · unverdicted · none · ref 128
Less Context, Better Agents: Efficient Context Engineering for Long-Horizon Tool-Using LLM Agents · cs.AI · 2026-06-08 · unverdicted · none · ref 10
