SweRank: Software Issue Localization with Code Ranking

2025 · cs.SE · arXiv 2505.07849

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Software issue localization, the task of identifying the precise code locations (files, classes, or functions) relevant to a natural language issue description (e.g., bug report, feature request), is a critical yet time-consuming aspect of software development. While recent LLM-based agentic approaches demonstrate promise, they often incur significant latency and cost due to complex multi-step reasoning and relying on closed-source LLMs. Alternatively, traditional code ranking models, typically optimized for query-to-code or code-to-code retrieval, struggle with the verbose and failure-descriptive nature of issue localization queries. To bridge this gap, we introduce SweRank, an efficient and effective retrieve-and-rerank framework for software issue localization. To facilitate training, we construct SweLoc, a large-scale dataset curated from public GitHub repositories, featuring real-world issue descriptions paired with corresponding code modifications. Empirical results on SWE-Bench-Lite and LocBench show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems using closed-source LLMs like Claude-3.5. Further, we demonstrate SweLoc's utility in enhancing various existing retriever and reranker models for issue localization, establishing the dataset as a valuable resource for the community.
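The retrieve-and-rerank pattern named in the abstract can be sketched in a few lines. This is only an illustrative toy, not SweRank's implementation: simple bag-of-words scorers stand in for the learned retriever and reranker, and the function names (`retrieve`, `rerank`, `localize`), the example repository, and the issue text are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (this also splits snake_case names)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(issue, functions, k):
    """Stage 1: score every candidate cheaply and keep the top k.
    Assumption: a lexical scorer stands in for a learned embedding retriever."""
    query = Counter(tokenize(issue))
    scored = [
        (cosine(query, Counter(tokenize(name + " " + body))), name)
        for name, body in functions.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, name breaks ties
    return [name for _, name in scored[:k]]


def rerank(issue, functions, candidates):
    """Stage 2: apply a costlier scorer to the shortlist only.
    Assumption: issue-token coverage stands in for a learned reranker."""
    query = set(tokenize(issue))

    def coverage(name):
        tokens = set(tokenize(name + " " + functions[name]))
        return len(query & tokens) / len(query) if query else 0.0

    return sorted(candidates, key=lambda name: (-coverage(name), name))


def localize(issue, functions, k=2):
    """Run retrieval, then rerank the shortlist."""
    return rerank(issue, functions, retrieve(issue, functions, k))


# Hypothetical repository: function name -> source text.
functions = {
    "parse_date": "def parse_date(s): return datetime.strptime(s, '%Y-%m-%d')",
    "render_page": "def render_page(template, ctx): return template.format(**ctx)",
    "load_config": "def load_config(path): return json.load(open(path))",
}
issue = "parse_date crashes with ValueError when the date string has a timezone"

ranked = localize(issue, functions, k=2)
print(ranked)
```

The split matters for cost: the first stage touches every candidate and so must be cheap, while the second stage can afford a more expensive model because it only sees the k shortlisted candidates.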

representative citing papers

Neurosymbolic Repo-level Code Localization

cs.SE · 2026-04-17 · unverdicted · novelty 7.0

LogicLoc combines LLMs with Datalog to achieve accurate repo-level code localization without relying on keyword shortcuts in benchmarks.

GALA: Multimodal Graph Alignment for Bug Localization in Automated Program Repair

cs.SE · 2026-04-09 · unverdicted · novelty 6.0

GALA uses hierarchical graph alignment between UI screenshots and code structures to achieve state-of-the-art bug localization in multimodal automated program repair on SWE-bench.

Retrieval-Conditioned Topology Selection with Provable Budget Conservation for Multi-Agent Code Generation

cs.AI · 2026-05-07 · unverdicted · novelty 5.0

RGAO combines retrieval-based complexity assessment with a formal budget algebra to enable dynamic topology selection in multi-agent code generation with provable conservation.
