D ebug B ench: Evaluating debugging capability of large language models

Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Hui Haotian, Liu Weichuan, Zhiyuan Liu, Maosong Sun · 2024 · DOI 10.18653/v1/2024.findings-acl.247

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

open at publisher browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

CASCADE: Detecting Inconsistencies between Code and Documentation with Automatic Test Generation

cs.SE · 2026-04-21 · unverdicted · novelty 7.0

CASCADE finds code-documentation mismatches by running LLM-generated tests from docs and confirming failure only when documentation-derived code succeeds on the same test.

Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs

cs.AI · 2026-04-09 · accept · novelty 7.0

The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.

citing papers explorer

Showing 2 of 2 citing papers.

CASCADE: Detecting Inconsistencies between Code and Documentation with Automatic Test Generation cs.SE · 2026-04-21 · unverdicted · none · ref 41
CASCADE finds code-documentation mismatches by running LLM-generated tests from docs and confirming failure only when documentation-derived code succeeds on the same test.
Wiring the 'Why': A Unified Taxonomy and Survey of Abductive Reasoning in LLMs cs.AI · 2026-04-09 · accept · none · ref 97
The paper delivers the first survey of abductive reasoning in LLMs, a unified two-stage taxonomy, a compact benchmark, and an analysis of gaps relative to deductive and inductive reasoning.

D ebug B ench: Evaluating debugging capability of large language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer