Unit test case generation with transformers and focal context

2020 · arXiv 2009.05617

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

From Exploration to Specification: LLM-Based Property Generation for Mobile App Testing

cs.SE · 2026-04-15 · unverdicted · novelty 7.0

PropGen automates property generation for Android app testing via LLM synthesis from guided exploration and feedback refinement, yielding 912 valid properties and 25 previously unknown bugs across 12 apps.

An Iterative Test-and-Repair Framework for Competitive Code Generation

cs.SE · 2026-04-07 · unverdicted · novelty 7.0

FixAudit improves LLM code generation on competitive programming benchmarks by training a shared model for iterative code-aware test generation and repair, achieving 35%+ gains in Pass@1 over baselines on the same 7B model.

CodeT: Code Generation with Generated Tests

cs.CL · 2022-07-21 · conditional · novelty 7.0

CodeT improves code generation accuracy by using the same model to create test cases and then selecting solutions via output agreement on those tests, raising HumanEval pass@1 from 47% to 65.8%.

FeedbackLLM: Metadata driven Multi-Agentic Language Agnostic Test Case Generator with Evolving prompt and Coverage Feedback

cs.SE · 2026-05-02 · unverdicted · novelty 6.0

FeedbackLLM uses line and branch coverage feedback agents in an iterative multi-agent process with a redundancy cache to generate test cases achieving higher coverage than baselines on standard C and Python benchmarks while scaling linearly in time.

Generalizing Test Cases for Comprehensive Test Scenario Coverage

cs.SE · 2026-04-23 · unverdicted · novelty 6.0

TestGeneralizer generalizes an initial test into a set of executable tests covering more diverse scenarios, delivering +31.66% mutation-based and +23.08% LLM-assessed scenario coverage gains over ChatTester on 12 open-source Java projects.

Enhancing Program Repair with Specification Guidance and Intermediate Behavioral Signals

cs.SE · 2026-04-13 · unverdicted · novelty 6.0

SpecTune improves LLM-based automated program repair by deriving localized postconditions at execution checkpoints and using alpha and beta signals to produce precise fault-localization and patch-generation guidance.

TestDecision: Sequential Test Suite Generation via Greedy Optimization and Reinforcement Learning

cs.SE · 2026-04-02 · unverdicted · novelty 6.0

By proving test suite coverage is monotone submodular and training LLMs with RL to maximize marginal gains, TestDecision improves branch coverage 38-52% and bug detection up to 95% over base models on ULT and LiveCodeBench.

Generating Project-Specific Test Cases with Requirement Validation Intention

cs.SE · 2025-07-28 · unverdicted · novelty 6.0

IntentionTest retrieves a reusable test from the project and edits it with an LLM to match a supplied validation intention, yielding tests that kill 28.1-37.6% more mutants, share 16.9-23.9% more coverage, and produce 23.7-49.0% more passing tests than four baselines on 3,680 cases.

Effective LLM Code Refinement via Property-Oriented and Structurally Minimal Feedback

cs.SE · 2025-06-23 · unverdicted · novelty 6.0

PGS generates property-oriented, structurally minimal feedback from high-level program properties to refine LLM code, yielding up to 13.4% pass@1 gains and 1.4-1.6x higher bug-fix rates than prior TDD and debugging baselines.

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

cs.SE · 2021-02-09 · unverdicted · novelty 6.0

CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.

PPO guided Agentic Pipeline for Adaptive Prompt Selection and Test Case Generation

cs.SE · 2026-05-01 · unverdicted · novelty 5.0

PPO-LLM adaptively selects among eight prompting techniques using an 11-dimensional state vector to guide an LLM toward higher branch and line coverage than static baselines on 20 benchmark programs.

StarCoder: may the source be with you!

cs.CL · 2023-05-09 · accept · novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology

cs.SE · 2024-02-02 · unverdicted · novelty 4.0

CodePori is a multi-agent LLM system for code generation whose participant evaluation identifies practical challenges like memory limits and hallucinations missed by binary benchmarks.

citing papers explorer

Showing 13 of 13 citing papers.

From Exploration to Specification: LLM-Based Property Generation for Mobile App Testing
cs.SE · 2026-04-15 · unverdicted · none · ref 58

An Iterative Test-and-Repair Framework for Competitive Code Generation
cs.SE · 2026-04-07 · unverdicted · none · ref 54

CodeT: Code Generation with Generated Tests
cs.CL · 2022-07-21 · conditional · none · ref 13

FeedbackLLM: Metadata driven Multi-Agentic Language Agnostic Test Case Generator with Evolving prompt and Coverage Feedback
cs.SE · 2026-05-02 · unverdicted · none · ref 24

Generalizing Test Cases for Comprehensive Test Scenario Coverage
cs.SE · 2026-04-23 · unverdicted · none · ref 39

Enhancing Program Repair with Specification Guidance and Intermediate Behavioral Signals
cs.SE · 2026-04-13 · unverdicted · none · ref 40

TestDecision: Sequential Test Suite Generation via Greedy Optimization and Reinforcement Learning
cs.SE · 2026-04-02 · unverdicted · none · ref 55

Generating Project-Specific Test Cases with Requirement Validation Intention
cs.SE · 2025-07-28 · unverdicted · none · ref 49

Effective LLM Code Refinement via Property-Oriented and Structurally Minimal Feedback
cs.SE · 2025-06-23 · unverdicted · none · ref 55

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
cs.SE · 2021-02-09 · unverdicted · none · ref 78

PPO guided Agentic Pipeline for Adaptive Prompt Selection and Test Case Generation
cs.SE · 2026-05-01 · unverdicted · none · ref 23

StarCoder: may the source be with you!
cs.CL · 2023-05-09 · accept · none · ref 187

CodePori: Large-Scale System for Autonomous Software Development Using Multi-Agent Technology
cs.SE · 2024-02-02 · unverdicted · none · ref 51