Evaluating LLMs Effectiveness in Detecting and Correcting Test Smells

2025 · arXiv 2506.07594

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

How Compliant Are GitHub Actions Workflows? A Checklist-Based Study with LLM-Assisted Auditing

cs.SE · 2026-05-03 · accept · novelty 6.0

GitHub Actions workflows achieve only 28% overall compliance with best practices, with LLMs enabling an 81% reduction in verification effort via hybrid adjudication but still requiring expert oversight for security judgments.

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code

cs.SE · 2026-04-25 · unverdicted · novelty 4.0

Locally deployed LLMs achieve 43-45% accuracy on Python bug detection but frequently produce only partial identifications of problematic code regions.

citing papers explorer

Showing 2 of 2 citing papers.

How Compliant Are GitHub Actions Workflows? A Checklist-Based Study with LLM-Assisted Auditing

cs.SE · 2026-05-03 · accept · none · ref 35

An Empirical Evaluation of Locally Deployed LLMs for Bug Detection in Python Code

cs.SE · 2026-04-25 · unverdicted · none · ref 11
