The complexity trap: Simple observation masking is as efficient as LLM summarization for agent context management

Tobias Lindenbauer, Igor Slinko, Ludwig Felder, Egor Bogomolov, Yaroslav Zharov · 2025 · arXiv 2508.21433

6 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

AGORA: Adapter-Grounded Observation-Action Retention for Inference-Free Prompt Compression in LLM Agents

cs.AI · 2026-05-26 · unverdicted · novelty 7.0

AGORA is an inference-free step-level compressor for LLM agent prompts that retains at least 75% of uncompressed performance in most tested settings where token-level methods collapse due to action-grammar destruction.

ContextEcho: A Benchmark for Persona Drift in Long Agentic-Coding Sessions

cs.CL · 2026-05-22 · unverdicted · novelty 7.0

ContextEcho benchmark shows persona drift occurs across 23 frontier models in long agentic-coding sessions, is not reliably reset by compaction, and can be restored by single-shot anchors with mode-dependent effects.

Scaling Test-Time Compute for Agentic Coding

cs.SE · 2026-04-16 · unverdicted · novelty 7.0

Structured summaries of agent trajectories enable Recursive Tournament Voting and adapted Parallel-Distill-Refine to scale test-time compute, improving frontier coding agents on SWE-Bench Verified and Terminal-Bench.

MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration

cond-mat.mtrl-sci · 2026-04-03 · conditional · novelty 7.0 · 2 refs

MatClaw shows a code-first LLM agent autonomously generating and executing workflows for ML force field training, Curie temperature prediction, and parameter search on CuInP2S6, succeeding on code but requiring interventions for tacit domain knowledge.

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning

cs.AI · 2026-05-14 · unverdicted · novelty 5.0

LaMR decomposes code context pruning into two rubrics using dedicated CRFs, a mixture-of-experts gate, and AST-derived labels to filter noise and often match or beat full-context baselines on coding benchmarks.

Reducing Token Usage of State-in-Context Agents using Minification

cs.SE · 2026-05-31 · unverdicted · novelty 3.0

Code minification reduces average input token usage by 42% in state-in-context agents with a 12 percentage point drop in resolution rate on SWE-bench Verified.

citing papers explorer

AGORA: Adapter-Grounded Observation-Action Retention for Inference-Free Prompt Compression in LLM Agents
cs.AI · 2026-05-26 · unverdicted · none · ref 17

ContextEcho: A Benchmark for Persona Drift in Long Agentic-Coding Sessions
cs.CL · 2026-05-22 · unverdicted · none · ref 47

Scaling Test-Time Compute for Agentic Coding
cs.SE · 2026-04-16 · unverdicted · none · ref 1

MatClaw: An Autonomous Code-First LLM Agent for End-to-End Materials Exploration
cond-mat.mtrl-sci · 2026-04-03 · conditional · none · ref 11 · 2 links

Context Pruning for Coding Agents via Multi-Rubric Latent Reasoning
cs.AI · 2026-05-14 · unverdicted · none · ref 34

Reducing Token Usage of State-in-Context Agents using Minification
cs.SE · 2026-05-31 · unverdicted · none · ref 7
