Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Mulong Xie, Sidong Feng, Zhenchang Xing, Jieshan Chen, Chunyang Chen · 2020 · arXiv 8089.341794

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

PBT-Bench: Benchmarking AI Agents on Property-Based Testing

cs.SE · 2026-05-13 · unverdicted · novelty 7.0 · 3 refs

PBT-Bench is a new benchmark of 100 property-based testing problems across 40 Python libraries; it measures LLM bug recall of 42.1-83.4% under guided prompting versus 31.4-76.7% for the baseline.

UniCoder: Unified Visual-to-Code Generation via Symbolic Rewards and Reference-Guided Code Optimization

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

UniCoder applies symbolic attribute alignment via an auxiliary LLM and reference-guided optimization in RL to achieve SOTA visual-to-code generation on ChartMimic, UniSVG, Design2Code, and ScreenBench.

CAPED: Context-Aware Privacy Exposure Defense for Mobile GUI Agents

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

CAPED reduces incidental visual privacy leakage in mobile GUI agents from 0.766 to 0.268 on seeded AndroidWorld tasks by selectively exposing only task-relevant screen content.

cs.SE · 2026-05-08 · unverdicted · novelty 6.0

SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.

All Green, Still Broken: Real-Flow Verification Lessons from an LLM-Integrated, Multi-Market Web Application

cs.SE · 2026-06-21 · unverdicted · novelty 5.0

Analysis of 252 bug fixes in an LLM-powered multi-market web app found 44% escaped through four seams invisible to component unit tests, motivating a four-seam verification framework.

LLM vs. Human Unit Tests: Fault Detection on Real Python Bugs

cs.SE · 2026-06-07 · unverdicted · novelty 5.0

LLM-generated unit tests with retrieval-augmented context detect faults in 69% of real Python bugs versus 17.2% for general-purpose human-written tests, with similar coverage levels.

MultiMend: Multilingual Program Repair with Context Augmentation and Multi-Hunk Patch Generation

cs.SE · 2025-01-27 · unverdicted · novelty 4.0

MultiMend augments buggy function context via retrieval and generates multi-hunk patches, fixing 2,227 of 5,501 bugs across six benchmarks in four languages.
