PlayCoder raises the rate of LLM-generated GUI apps that can be played end-to-end without logic errors from near zero to 20.3% Play@3 by adding repository-aware generation, agent-driven testing, and iterative repair.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.SE 2years
2026 2representative citing papers
WebTestPilot symbolizes GUI elements to infer contextual oracles for end-to-end web testing from natural language specs, reporting 99% task completion and 96% precision/recall on a new bug-injected benchmark.
citing papers explorer
-
PlayCoder: Making LLM-Generated GUI Code Playable
PlayCoder raises the rate of LLM-generated GUI apps that can be played end-to-end without logic errors from near zero to 20.3% Play@3 by adding repository-aware generation, agent-driven testing, and iterative repair.
-
WebTestPilot: Agentic End-to-End Web Testing against Natural Language Specification by Inferring Oracles with Symbolized GUI Elements
WebTestPilot symbolizes GUI elements to infer contextual oracles for end-to-end web testing from natural language specs, reporting 99% task completion and 96% precision/recall on a new bug-injected benchmark.