Chatting with GPT-3 for zero-shot human-like mobile automated GUI testing

Zhe Liu, Chunyang Chen, Junjie Wang, Mengzhuo Chen, Boyu Wu, Xing Che, Dandan Wang, Qing Wang · 2023 · arXiv 2305.09434

4 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background: 1

citation-polarity summary

background: 1

representative citing papers

PlayCoder: Making LLM-Generated GUI Code Playable

cs.SE · 2026-04-21 · conditional · novelty 7.0

PlayCoder raises the rate of LLM-generated GUI apps that can be played end-to-end without logic errors from near zero to 20.3% Play@3 by adding repository-aware generation, agent-driven testing, and iterative repair.

WebTestPilot: Agentic End-to-End Web Testing against Natural Language Specification by Inferring Oracles with Symbolized GUI Elements

cs.SE · 2026-02-12 · unverdicted · novelty 7.0

WebTestPilot symbolizes GUI elements to infer contextual oracles for end-to-end web testing from natural language specs, reporting 99% task completion and 96% precision/recall on a new bug-injected benchmark.

Proactive Detection of GUI Defects in Multi-Window Scenarios via Multimodal Reasoning

cs.SE · 2026-04-21 · unverdicted · novelty 6.0

Proactive multi-window state triggering plus Set-of-Mark alignment and multimodal LLM reasoning detects GUI defects in Android apps, reporting 184% more text truncation, 87.2% F1 on occlusion, and 40 defect-prone apps at 10% FPR.

DynamicsLLM: a Dynamic Analysis-based Tool for Generating Intelligent Execution Traces Using LLMs to Detect Android Behavioural Code Smells

cs.SE · 2026-04-12 · unverdicted · novelty 6.0

DynamicsLLM uses LLMs to generate execution traces that cover three times more code smell-related events than the prior Dynamics tool on 333 F-Droid Android apps, with a hybrid method adding 25.9% coverage for low-activity apps.

citing papers explorer

PlayCoder: Making LLM-Generated GUI Code Playable · cs.SE · 2026-04-21 · conditional · none · ref 45
WebTestPilot: Agentic End-to-End Web Testing against Natural Language Specification by Inferring Oracles with Symbolized GUI Elements · cs.SE · 2026-02-12 · unverdicted · none · ref 39
Proactive Detection of GUI Defects in Multi-Window Scenarios via Multimodal Reasoning · cs.SE · 2026-04-21 · unverdicted · none · ref 21
DynamicsLLM: a Dynamic Analysis-based Tool for Generating Intelligent Execution Traces Using LLMs to Detect Android Behavioural Code Smells · cs.SE · 2026-04-12 · unverdicted · none · ref 28
