A Preliminary Evaluation of LLM-Based Fault Localization

Sungmin Kang, Gabin An, Shin Yoo · 2023 · arXiv 2308.05487

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Atropos: Improving Cost-Benefit Trade-off of LLM-based Agents under Self-Consistency with Early Termination and Model Hotswap

cs.SE · 2026-04-16 · unverdicted · novelty 7.0

Atropos uses a GCN on inference graphs for early failure prediction and hotswaps to larger LLMs, achieving 74% of large-model performance at 24% of the cost.

Towards Agentic Runtime Healing

cs.SE · 2024-08-02 · unverdicted · novelty 7.0

Healer uses LLMs to dynamically generate and execute runtime error-handling code, with GPT-4 recovering from 72.8% of errors across four datasets.

TransAgent: Enhancing LLM-Based Code Translation via Fine-Grained Execution Alignment

cs.SE · 2024-09-30 · unverdicted · novelty 5.0

TransAgent improves LLM code translation by up to 33.3% via multi-agent fine-grained execution alignment on a new benchmark of recent tasks.

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code

cs.SE · 2026-04-25 · unverdicted · novelty 4.0

Locally deployed LLMs achieve 43-45% accuracy on Python bug detection but frequently produce only partial identifications of problematic code regions.

citing papers explorer

Atropos: Improving Cost-Benefit Trade-off of LLM-based Agents under Self-Consistency with Early Termination and Model Hotswap · cs.SE · 2026-04-16 · unverdicted · none · ref 8
Towards Agentic Runtime Healing · cs.SE · 2024-08-02 · unverdicted · none · ref 28
TransAgent: Enhancing LLM-Based Code Translation via Fine-Grained Execution Alignment · cs.SE · 2024-09-30 · unverdicted · none · ref 61
An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code · cs.SE · 2026-04-25 · unverdicted · none · ref 9
