In: 2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC), pp

doi:10 · 2025 · arXiv 6645.2025

9 Pith papers cite this work.

representative citing papers

Repository-Level Solidity Code Generation with Large Language Models: From Prompting to Fine-Tuning

cs.SE · 2026-06-18 · unverdicted · novelty 7.0

Introduces SolidityBench benchmark and SolidityScore metric for repository-level Solidity code generation, finding supervised fine-tuning outperforms prompting, CoT, ICL, and RAG methods on evaluated LLMs.

AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub

cs.SE · 2026-04-04 · accept · novelty 7.0

AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.

How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests

cs.SE · 2026-01-24 · unverdicted · novelty 7.0

AI coding agents produce pull requests with substantially more commits and slightly higher description-to-diff similarity than human developers, based on analysis of 29,095 merged PRs.

Beyond the Tip of the Iceberg: Understanding SATD in Dockerfiles through the Lens of Co-evolution

cs.SE · 2026-05-20 · unverdicted · novelty 6.0

Analysis of SATD in Dockerfiles shows 27% of admissions and 40% of repayments are coupled to non-Dockerfile artifacts, with coupled events repaid faster overall and external dependencies as a key trigger.

How Robustly do LLMs Understand Execution Semantics?

cs.SE · 2026-02-24 · unverdicted · novelty 6.0

Frontier LLMs like GPT-5.2 show large accuracy drops on perturbed program-output prediction tasks while open-source reasoning models remain more stable, exposing limits in code semantics understanding.

Automation and Reuse Practices in GitHub Actions Workflows: A Practitioner's Perspective

cs.SE · 2026-01-16 · conditional · novelty 6.0

A survey of 419 practitioners shows strong reliance on reusable GitHub Actions for core CI/CD tasks but limited adoption of reusable workflows, with copy-pasting remaining common due to versioning and trust issues.

Multi-LLM Orchestration for High-Quality Code Generation: Exploiting Complementary Model Strengths

cs.SE · 2025-10-01 · conditional · novelty 6.0 · 2 refs

PerfOrch is a four-agent multi-LLM system that uses offline profiling to build language-and-category rankings for routing tasks, achieving 97.19% and 95.83% pass@1 on HumanEval-X and EffiBench-X with generalization across benchmarks.

Tracers for debugging and program exploration

cs.PL · 2026-04-10 · unverdicted · novelty 5.0

Debugging tools should present execution history in time order to support better hypothesis generation about program behavior.

Algorithmic algorithm development with LLMs: A Case Study on LLM-Usage for Contraction Order Optimization in Tensor Networks

cs.AI · 2026-06-01 · unverdicted · novelty 4.0

Case study applies verifier-guided LLM evolutionary agents to contraction-order optimization in tensor networks and concludes that human validation remains essential.

citing papers explorer

Repository-Level Solidity Code Generation with Large Language Models: From Prompting to Fine-Tuning · cs.SE · 2026-06-18 · unverdicted · none · ref 36
AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub · cs.SE · 2026-04-04 · accept · none · ref 27
How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests · cs.SE · 2026-01-24 · unverdicted · none · ref 37
Beyond the Tip of the Iceberg: Understanding SATD in Dockerfiles through the Lens of Co-evolution · cs.SE · 2026-05-20 · unverdicted · none · ref 63
How Robustly do LLMs Understand Execution Semantics? · cs.SE · 2026-02-24 · unverdicted · none · ref 12
Automation and Reuse Practices in GitHub Actions Workflows: A Practitioner's Perspective · cs.SE · 2026-01-16 · conditional · none · ref 54
Multi-LLM Orchestration for High-Quality Code Generation: Exploiting Complementary Model Strengths · cs.SE · 2025-10-01 · conditional · none · ref 2 · 2 links
Tracers for debugging and program exploration · cs.PL · 2026-04-10 · unverdicted · none · ref 49
Algorithmic algorithm development with LLMs: A Case Study on LLM-Usage for Contraction Order Optimization in Tensor Networks · cs.AI · 2026-06-01 · unverdicted · none · ref 26
