ConCovUp uses static analysis to ground LLM test generation and backward tracing to produce concurrent test drivers that raise average shared-memory access pair coverage from 36.6% to 68.1% on nine real-world libraries.
Chatunitest: A framework for llm-based test generation
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.SE 7roles
background 3representative citing papers
LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.
GLMTest integrates code property graphs and GNNs with LLMs to steer test case generation toward targeted branches, raising branch accuracy from 27.4% to 50.2% on the TestGenEval benchmark.
A perturbation method shows memorization advantage in code LLMs varies widely by model and task, remaining low on CVEFixes and Defects4J benchmarks.
AI models generated nearly 16,000 lines of unit tests in hours and enabled safe large-scale refactoring with up to 78% branch coverage in a case study.
Survey mapping LLM applications in software quality assurance to established standards including ISO/IEC 12207, ISO 25010, CMMI, and TMM, with case studies, challenges, and future directions.
Generative AI suitability in qualitative research depends primarily on the approach (small-q positivist/post-positivist or Big Q non-positivist) along with skills, ethics, and personal preferences.
citing papers explorer
-
ConCovUp: Effective Agent-Based Test Driver Generation for Concurrency Testing
ConCovUp uses static analysis to ground LLM test generation and backward tracing to produce concurrent test drivers that raise average shared-memory access pair coverage from 36.6% to 68.1% on nine real-world libraries.
-
Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software
LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.
-
Program Structure-aware Language Models: Targeted Software Testing beyond Textual Semantics
GLMTest integrates code property graphs and GNNs with LLMs to steer test case generation toward targeted branches, raising branch accuracy from 27.4% to 50.2% on the TestGenEval benchmark.
-
Learned or Memorized ? Quantifying Memorization Advantage in Code LLMs
A perturbation method shows memorization advantage in code LLMs varies widely by model and task, remaining low on CVEFixes and Defects4J benchmarks.
-
AI-Assisted Unit Test Writing and Test-Driven Code Refactoring: A Case Study
AI models generated nearly 16,000 lines of unit tests in hours and enabled safe large-scale refactoring with up to 78% branch coverage in a case study.
-
A Blueprint for AI-Driven Software Quality: Integrating LLMs with Established Standards
Survey mapping LLM applications in software quality assurance to established standards including ISO/IEC 12207, ISO 25010, CMMI, and TMM, with case studies, challenges, and future directions.
-
To Vibe Research or Not to Vibe Research? Generative AI in Qualitative Research
Generative AI suitability in qualitative research depends primarily on the approach (small-q positivist/post-positivist or Big Q non-positivist) along with skills, ethics, and personal preferences.