In Proceedings of the IEEE/ACM International Conference on Software Engineering (ICSE)

Jialun Cao, Yuk-Kit Chan, Zixuan Ling, Wenxuan Wang, Shuqing Li, Mingwei Liu, Chaozheng Wang, Boxi Yu, Pinjia He, Shuai Wang, et al. · 2025 · arXiv 2501.10711

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

Edit, But Verify: An Empirical Audit of Instructed Code-Editing Benchmarks

cs.SE · 2026-04-06 · conditional · novelty 8.0

The two main benchmarks for LLM instructed code editing over-represent Python, miss common real-world domains and edit types, and have test coverage issues that limit what they measure.

Guidelines for Empirical Studies in Software Engineering involving Large Language Models

cs.SE · 2025-08-21 · accept · novelty 7.0 · 2 refs

The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.

Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study

cs.SE · 2025-11-29 · unverdicted · novelty 6.0

APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.

Beyond the Leaderboard: Rethinking Medical Benchmarks for Large Language Models

cs.CL · 2025-08-06 · unverdicted · novelty 6.0

MedCheck is a lifecycle checklist framework that audits 53 existing medical LLM benchmarks and identifies systemic gaps in clinical fidelity, contamination control, and safety metrics.

Across Programming Language Silos: A Study on Cross-Lingual Retrieval-augmented Code Generation

cs.SE · 2025-06-04 · accept · novelty 6.0

Cross-lingual RACG shows non-trivial but unequal knowledge transfer across 13 programming languages, depending on linguistic affinity and pretraining diversity, with limited reliance on natural language information when using code-specific retrievers.

citing papers explorer

Edit, But Verify: An Empirical Audit of Instructed Code-Editing Benchmarks

cs.SE · 2026-04-06 · conditional · none · ref 4

Guidelines for Empirical Studies in Software Engineering involving Large Language Models

cs.SE · 2025-08-21 · accept · none · ref 19 · 2 links

Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study

cs.SE · 2025-11-29 · unverdicted · none · ref 5

Beyond the Leaderboard: Rethinking Medical Benchmarks for Large Language Models

cs.CL · 2025-08-06 · unverdicted · none · ref 2

Across Programming Language Silos: A Study on Cross-Lingual Retrieval-augmented Code Generation

cs.SE · 2025-06-04 · accept · none · ref 23
