SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents

Yuhang Wang, Yuling Shi, Mo Yang, Rongrui Zhang, Shilin He, Heng Lian, Yuting Chen, Siyu Ye, Kai Cai, Xiaodong Gu


classification cs.SE cs.CL

keywords context, pruning, swe-pruner, agents, agent, code, coding, compression


Abstract

LLM agents have demonstrated remarkable capabilities in software development, but their performance is hampered by long interaction contexts, which incur high API costs and latency. While various context compression approaches such as LongLLMLingua have emerged to tackle this challenge, they typically rely on fixed metrics such as PPL, ignoring the task-specific nature of code understanding. As a result, they frequently disrupt syntactic and logical structure and fail to retain critical implementation details. In this paper, we propose SWE-Pruner, a self-adaptive context pruning framework tailored for coding agents. Drawing inspiration from how human programmers "selectively skim" source code during development and debugging, SWE-Pruner performs task-aware adaptive pruning for long contexts. Given the current task, the agent formulates an explicit goal (e.g., "focus on error handling") as a hint to guide the pruning targets. A lightweight neural skimmer (0.6B parameters) is trained to dynamically select relevant lines from the surrounding context given the goal. Evaluations across four benchmarks and multiple models validate SWE-Pruner's effectiveness in various scenarios, achieving 23-54% token reduction on agent tasks like SWE-Bench Verified while even improving success rates, and up to 14.84x compression on single-turn tasks like LongCodeQA with minimal performance impact.
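The goal-guided, line-level selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 0.6B neural skimmer is replaced here by a toy keyword-overlap scorer, and the names `keyword_score`, `prune_context`, the `threshold` value, and the `...` placeholder are all hypothetical choices made for illustration only.

```python
import re
from typing import Callable


def keyword_score(goal: str, line: str) -> float:
    """Toy stand-in for the neural skimmer (an assumption, not the paper's model).

    Returns the fraction of goal words that also appear in the line.
    """
    goal_words = set(re.findall(r"[a-z_]+", goal.lower()))
    line_words = set(re.findall(r"[a-z_]+", line.lower()))
    if not goal_words:
        return 0.0
    return len(goal_words & line_words) / len(goal_words)


def prune_context(
    goal: str,
    context: str,
    score: Callable[[str, str], float] = keyword_score,
    threshold: float = 0.25,  # hypothetical cutoff, chosen for this demo
    placeholder: str = "...",
) -> str:
    """Keep lines whose relevance to the goal meets the threshold.

    Each run of consecutive dropped lines is collapsed into one placeholder,
    so the reader can still see where content was removed.
    """
    kept = []
    in_pruned_run = False
    for line in context.splitlines():
        if score(goal, line) >= threshold:
            kept.append(line)
            in_pruned_run = False
        elif not in_pruned_run:
            kept.append(placeholder)
            in_pruned_run = True
    return "\n".join(kept)


if __name__ == "__main__":
    source = "\n".join([
        "def load(path):",
        "    try:",
        "        data = open(path).read()",
        "    except OSError as error:",
        "        raise RuntimeError(path) from error",
        "    return data",
    ])
    # The goal string is the example hint quoted in the abstract.
    print(prune_context("focus on error handling", source))
```

Because the scorer is passed in as a parameter, a learned relevance model could be substituted for `keyword_score` without changing the selection loop; how the actual skimmer scores lines and sets its cutoff is not specified in the abstract.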



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ClassEval-Pro: A Cross-Domain Benchmark for Class-Level Code Generation
cs.SE 2026-04 unverdicted novelty 7.0

ClassEval-Pro benchmark shows frontier LLMs achieve at most 45.6% Pass@1 on class-level code tasks, with logic errors (56%) and dependency errors (38%) as dominant failure modes.

Squeez: Task-Conditioned Tool-Output Pruning for Coding Agents
cs.SE 2026-04 unverdicted novelty 7.0

A LoRA-fine-tuned Qwen 3.5 2B model for task-conditioned tool-output pruning reaches 0.86 recall and 0.80 F1 on a new 618-example test set while removing 92% of input tokens and outperforming larger zero-shot models.

REAgent: Requirement-Driven LLM Agents for Software Issue Resolution
cs.SE 2026-04 unverdicted novelty 6.0

REAgent improves LLM patch generation for software issues by 17.4% on average through automated construction, quality checking, and iterative refinement of structured issue-oriented requirements.