In 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20

Sungmin Kang, Juyeon Yoon, Shin Yoo · 2023 · arXiv 8619.2023

Canonical reference. 76% of citing Pith papers cite this work as background.

83 Pith papers citing it

Background 76% of classified citations

citation-role summary

background 32 · baseline 2 · method 2 · extension 1 · other 1

citation-polarity summary

background 29 · support 3 · baseline 2 · use method 2 · extend 1 · unclear 1

representative citing papers

Autoregressive, Yet Revisable: In Decoding Revision for Secure Code Generation

cs.SE · 2026-02-01 · unverdicted · novelty 8.0

Stream of Revision adds action tokens to LLM decoding so the model can revise its own code history on the fly, cutting vulnerabilities in generated code with little added cost.

RepairAgent: An Autonomous, LLM-Based Agent for Program Repair

cs.SE · 2024-03-25 · conditional · novelty 8.0 · 2 refs

RepairAgent autonomously repairs 164 bugs on Defects4J, including 39 not fixed by prior techniques, by treating an LLM as an agent that invokes tools via a finite state machine and dynamic prompts.

BioDefect: The First Dataset for Defect Detection in Bioinformatics Software

cs.SE · 2026-05-20 · unverdicted · novelty 7.0 · 3 refs

BioDefect is a new dataset for defect detection in bioinformatics software that improves average F1-scores by 29.61% to 38.04% over existing datasets when evaluated on nine language models.

Code Generation by Differential Test Time Scaling

cs.SE · 2026-05-19 · unverdicted · novelty 7.0

DiffCodeGen clusters code candidates by behavioral similarity from fuzzing-synthesized inputs and selects the largest cluster's medoid, matching or exceeding prior test-time scaling methods with far less token and time cost.

Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support

cs.SE · 2026-05-14 · unverdicted · novelty 7.0

Hydra enables asynchronous static error checking and targeted checkpoint-rollback repair during LLM code generation, cutting latency by up to 71% and token use by up to 70% versus post-hoc repair on C/C++ tasks.

Quantifying Sensitivity for Tree Ensembles: A symbolic and compositional approach

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

A compositional algebraic decision diagram algorithm quantifies sensitivity in decision tree ensembles with certified error and confidence bounds, outperforming model counters on benchmarks.

The Death Spiral of Open Source Projects: A Post-Mortem Analysis of Pull Request Workflow Dynamics

cs.SE · 2026-05-12 · unverdicted · novelty 7.0

Large-scale analysis of inactive GitHub repositories shows open source projects die primarily from insufficient value and ecosystem dynamics, not from pull request workflow problems, despite a common pattern of declining activity.

Breaking the Dependency Chaos: A Constraint-Driven Python Dependency Resolution Strategy with Selective LLM Imputation

cs.SE · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

SMT-LLM builds a constraint graph from PyPI metadata and AST-derived imports, solves it with Z3, and uses LLM imputation only when needed, resolving 83.6% of HG2.9K snippets versus PLLM's 54.8% while cutting median time by 6.3x and LLM calls by 11x.

ConCovUp: Effective Agent-Based Test Driver Generation for Concurrency Testing

cs.SE · 2026-05-10 · unverdicted · novelty 7.0

ConCovUp uses static analysis to ground LLM test generation and backward tracing to produce concurrent test drivers that raise average shared-memory access pair coverage from 36.6% to 68.1% on nine real-world libraries.

Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents

cs.SE · 2026-05-09 · unverdicted · novelty 7.0

PROBE structures runtime telemetry into diagnoses and evidence-grounded guidance, raising recovery rates by 12.45 points over baselines on 257 unresolved software repair and AIOps cases.

SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair

cs.SE · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

SmellBench is the first benchmark showing LLM agents resolve 47.7% of architectural code smells while accurately spotting false positives, but aggressive repairs often introduce new smells and degrade overall quality.

VulKey: Automated Vulnerability Repair Guided by Domain-Specific Repair Patterns

cs.CR · 2026-05-03 · unverdicted · novelty 7.0

VulKey reaches 31.5% repair accuracy on real C/C++ vulnerabilities by matching hierarchical expert patterns to guide LLM patch generation, beating prior baselines by 7.6%.

ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs

cs.SE · 2026-05-01 · unverdicted · novelty 7.0

ClozeMaster masks bracketed structures in historical Rust bug code and uses LLMs to infill them, generating test programs that discovered 27 confirmed bugs in rustc and mrustc while outperforming existing fuzzers.

Single-Language Evidence Is Insufficient for Automated Logging: A Multilingual Benchmark and Empirical Study with LLMs

cs.SE · 2026-04-19 · unverdicted · novelty 7.0 · 2 refs

MultiLogBench shows that LLM performance on automated logging varies substantially across programming languages, demonstrating that single-language evidence is insufficient for general claims about model behavior or tool design.

Isolating Recurring Execution-Dependent Abnormal Patterns on NISQ Quantum Devices

cs.SE · 2026-04-19 · unverdicted · novelty 7.0

QRisk isolates backend-specific abnormal error patterns on NISQ devices via delta debugging and mitigates them with commuting gate swaps, cutting excess noise by 24-45% on IBM backends where noise models predict no difference.

Clover: A Neural-Symbolic Agentic Harness with Stochastic Tree-of-Thoughts for Verified RTL Repair

cs.AR · 2026-04-19 · unverdicted · novelty 7.0 · 2 refs

Clover fixes 96.8% of bugs on an RTL-repair benchmark using stochastic tree-of-thoughts and neural-symbolic agents, outperforming traditional and LLM baselines by 94% and 63% respectively with 87.5% pass@1.

Towards Personalizing Secure Programming Education with LLM-Injected Vulnerabilities

cs.CR · 2026-04-15 · conditional · novelty 7.0

LLM agents inject CWEs into student-authored code to generate personalized security examples; in a 71-student deployment, participants rated them more relevant than textbook cases but quantitative differences remained limited.

CIR+CVN: Bridging LLM Semantic Understanding and Petri-Net Verification for Concurrent Programs

cs.PL · 2026-04-10 · unverdicted · novelty 7.0 · 2 refs

An LLM synthesizes an alias-free concurrency model (CIR) from natural language that is translated to a Petri net (CVN) for exhaustive verification and targeted repair, with goal-reachability checks to avoid incomplete fixes.

REAP: Automatic Curation of Coding Agent Benchmarks from Interactive Production Usage

cs.SE · 2026-04-02 · unverdicted · novelty 7.0

REAP automatically curates production-derived benchmarks for AI coding agents via LLM classification and stability checks, producing the Harvest benchmark with model solve rates of 42.9-58.2%.

Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review

cs.SE · 2026-03-19 · accept · novelty 7.0

LLM-based security code review is vulnerable to framing bias, with a novel iterative refinement attack achieving 100% success in reintroducing vulnerabilities across real projects.

Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems

cs.SE · 2025-11-02 · unverdicted · novelty 7.0

Build-bench is the first architecture-aware benchmark that evaluates LLMs on repairing cross-ISA build failures via iterative tool-augmented reasoning, with the best model reaching 63.19% success.

Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software

cs.SE · 2025-10-17 · unverdicted · novelty 7.0

LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.

ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation

cs.AI · 2025-10-14 · unverdicted · novelty 7.0

ContractEval benchmark on 364 tasks shows code LLMs achieve 75-82% functional pass@1 but 0% contract satisfaction under standard prompting, rising only to 23-41% with explicit contracts.

ML Code Smells: From Specification to Detection

cs.SE · 2025-09-24 · unverdicted · novelty 7.0

SpecDetect4ML detects 22 ML code smells via DSL specifications and CPG-based analysis, reporting 95.82% precision and 88.14% recall on 890 ML systems while outperforming prior tools.

citing papers explorer

Showing 50 of 83 citing papers.

Autoregressive, Yet Revisable: In Decoding Revision for Secure Code Generation cs.SE · 2026-02-01 · unverdicted · none · ref 3
Stream of Revision adds action tokens to LLM decoding so the model can revise its own code history on the fly, cutting vulnerabilities in generated code with little added cost.
RepairAgent: An Autonomous, LLM-Based Agent for Program Repair cs.SE · 2024-03-25 · conditional · none · ref 19 · 2 links
RepairAgent autonomously repairs 164 bugs on Defects4J, including 39 not fixed by prior techniques, by treating an LLM as an agent that invokes tools via a finite state machine and dynamic prompts.
BioDefect: The First Dataset for Defect Detection in Bioinformatics Software cs.SE · 2026-05-20 · unverdicted · none · ref 20 · 3 links
BioDefect is a new dataset for defect detection in bioinformatics software that improves average F1-scores by 29.61% to 38.04% over existing datasets when evaluated on nine language models.
Code Generation by Differential Test Time Scaling cs.SE · 2026-05-19 · unverdicted · none · ref 79
DiffCodeGen clusters code candidates by behavioral similarity from fuzzing-synthesized inputs and selects the largest cluster's medoid, matching or exceeding prior test-time scaling methods with far less token and time cost.
Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support cs.SE · 2026-05-14 · unverdicted · none · ref 12
Hydra enables asynchronous static error checking and targeted checkpoint-rollback repair during LLM code generation, cutting latency by up to 71% and token use by up to 70% versus post-hoc repair on C/C++ tasks.
Quantifying Sensitivity for Tree Ensembles: A symbolic and compositional approach cs.AI · 2026-05-13 · unverdicted · none · ref 4
A compositional algebraic decision diagram algorithm quantifies sensitivity in decision tree ensembles with certified error and confidence bounds, outperforming model counters on benchmarks.
The Death Spiral of Open Source Projects: A Post-Mortem Analysis of Pull Request Workflow Dynamics cs.SE · 2026-05-12 · unverdicted · none · ref 52
Large-scale analysis of inactive GitHub repositories shows open source projects die primarily from insufficient value and ecosystem dynamics, not from pull request workflow problems, despite a common pattern of declining activity.
Breaking the Dependency Chaos: A Constraint-Driven Python Dependency Resolution Strategy with Selective LLM Imputation cs.SE · 2026-05-12 · unverdicted · none · ref 11 · 2 links
SMT-LLM builds a constraint graph from PyPI metadata and AST-derived imports, solves it with Z3, and uses LLM imputation only when needed, resolving 83.6% of HG2.9K snippets versus PLLM's 54.8% while cutting median time by 6.3x and LLM calls by 11x.
ConCovUp: Effective Agent-Based Test Driver Generation for Concurrency Testing cs.SE · 2026-05-10 · unverdicted · none · ref 42
ConCovUp uses static analysis to ground LLM test generation and backward tracing to produce concurrent test drivers that raise average shared-memory access pair coverage from 36.6% to 68.1% on nine real-world libraries.
Debugging the Debuggers: Failure-Anchored Structured Recovery for Software Engineering Agents cs.SE · 2026-05-09 · unverdicted · none · ref 1
PROBE structures runtime telemetry into diagnoses and evidence-grounded guidance, raising recovery rates by 12.45 points over baselines on 257 unresolved software repair and AIOps cases.
SmellBench: Evaluating LLM Agents on Architectural Code Smell Repair cs.SE · 2026-05-07 · unverdicted · none · ref 27 · 2 links
SmellBench is the first benchmark showing LLM agents resolve 47.7% of architectural code smells while accurately spotting false positives, but aggressive repairs often introduce new smells and degrade overall quality.
VulKey: Automated Vulnerability Repair Guided by Domain-Specific Repair Patterns cs.CR · 2026-05-03 · unverdicted · none · ref 39
VulKey reaches 31.5% repair accuracy on real C/C++ vulnerabilities by matching hierarchical expert patterns to guide LLM patch generation, beating prior baselines by 7.6%.
ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs cs.SE · 2026-05-01 · unverdicted · none · ref 38
ClozeMaster masks bracketed structures in historical Rust bug code and uses LLMs to infill them, generating test programs that discovered 27 confirmed bugs in rustc and mrustc while outperforming existing fuzzers.
Single-Language Evidence Is Insufficient for Automated Logging: A Multilingual Benchmark and Empirical Study with LLMs cs.SE · 2026-04-19 · unverdicted · none · ref 14 · 2 links
MultiLogBench shows that LLM performance on automated logging varies substantially across programming languages, demonstrating that single-language evidence is insufficient for general claims about model behavior or tool design.
Isolating Recurring Execution-Dependent Abnormal Patterns on NISQ Quantum Devices cs.SE · 2026-04-19 · unverdicted · none · ref 27
QRisk isolates backend-specific abnormal error patterns on NISQ devices via delta debugging and mitigates them with commuting gate swaps, cutting excess noise by 24-45% on IBM backends where noise models predict no difference.
Clover: A Neural-Symbolic Agentic Harness with Stochastic Tree-of-Thoughts for Verified RTL Repair cs.AR · 2026-04-19 · unverdicted · none · ref 8 · 2 links
Clover fixes 96.8% of bugs on an RTL-repair benchmark using stochastic tree-of-thoughts and neural-symbolic agents, outperforming traditional and LLM baselines by 94% and 63% respectively with 87.5% pass@1.
Towards Personalizing Secure Programming Education with LLM-Injected Vulnerabilities cs.CR · 2026-04-15 · conditional · none · ref 20
LLM agents inject CWEs into student-authored code to generate personalized security examples; in a 71-student deployment, participants rated them more relevant than textbook cases but quantitative differences remained limited.
CIR+CVN: Bridging LLM Semantic Understanding and Petri-Net Verification for Concurrent Programs cs.PL · 2026-04-10 · unverdicted · none · ref 12 · 2 links
An LLM synthesizes an alias-free concurrency model (CIR) from natural language that is translated to a Petri net (CVN) for exhaustive verification and targeted repair, with goal-reachability checks to avoid incomplete fixes.
REAP: Automatic Curation of Coding Agent Benchmarks from Interactive Production Usage cs.SE · 2026-04-02 · unverdicted · none · ref 14
REAP automatically curates production-derived benchmarks for AI coding agents via LLM classification and stability checks, producing the Harvest benchmark with model solve rates of 42.9-58.2%.
Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review cs.SE · 2026-03-19 · accept · none · ref 14
LLM-based security code review is vulnerable to framing bias, with a novel iterative refinement attack achieving 100% success in reintroducing vulnerabilities across real projects.
Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems cs.SE · 2025-11-02 · unverdicted · none · ref 34
Build-bench is the first architecture-aware benchmark that evaluates LLMs on repairing cross-ISA build failures via iterative tool-augmented reasoning, with the best model reaching 63.19% success.
Do AI Models Dream of Faster Code? An Empirical Study on LLM-Proposed Performance Improvements in Real-World Software cs.SE · 2025-10-17 · unverdicted · none · ref 46
LLMs propose volatile performance improvements on real-world Java tasks that lag human developers on average, showing algorithmic benchmarks overestimate capabilities.
ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation cs.AI · 2025-10-14 · unverdicted · none · ref 18
ContractEval benchmark on 364 tasks shows code LLMs achieve 75-82% functional pass@1 but 0% contract satisfaction under standard prompting, rising only to 23-41% with explicit contracts.
ML Code Smells: From Specification to Detection cs.SE · 2025-09-24 · unverdicted · none · ref 11
SpecDetect4ML detects 22 ML code smells via DSL specifications and CPG-based analysis, reporting 95.82% precision and 88.14% recall on 890 ML systems while outperforming prior tools.
CodeCureAgent: Automatic Classification and Repair of Static Analysis Warnings cs.SE · 2025-09-15 · conditional · none · ref 23
CodeCureAgent achieves 96.8% plausible fixes and 86.3% correct fixes for 1,000 SonarQube warnings across 106 Java projects using an agentic LLM framework.
Once4All: Skeleton-Guided SMT Solver Fuzzing with LLM-Synthesized Generators cs.SE · 2025-08-28 · conditional · none · ref 20 · 2 links
Once4All synthesizes LLM-based generators from extracted SMT grammars and populates formula skeletons to fuzz Z3 and cvc5, discovering 43 confirmed bugs with 40 fixed.
Guidelines for Empirical Studies in Software Engineering involving Large Language Models cs.SE · 2025-08-21 · accept · none · ref 67 · 2 links
The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.
Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models cs.SE · 2025-06-23 · unverdicted · none · ref 104
A black-box LLM approach for fault localization in system-level test code that estimates execution traces from failure logs to rank potential faults with reduced inference cost.
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution cs.SE · 2025-02-25 · unverdicted · none · ref 56
SWE-RL uses RL on software evolution data to train LLMs, achieving 41% on SWE-bench Verified with generalization to other reasoning tasks.
NESA: Relational Neuro-Symbolic Static Program Analysis cs.PL · 2024-12-18 · conditional · none · ref 56
NESA presents a neuro-symbolic framework that decomposes static analyses into policy-defined sub-problems solved by parsers and LLMs to enable compilation-free customizable analysis with reduced hallucinations.
Direction for Detection: A Survey of Automated Vulnerability Detection and all of its Pain Points cs.SE · 2024-12-15 · conditional · none · ref 79
ML4AVD research remains locked into binary function-level classification of C/C++ vulnerabilities because twelve pain points in the pipeline reinforce each other through feedback loops.
Three Heads Are Better Than One: A Multi-perspective Reasoning Framework for Enhanced Vulnerability Detection cs.SE · 2026-05-18 · conditional · none · ref 50
ReasonVul deploys three LLM agents with independent analysis and structured debate to achieve 40% PairAcc and 72.52% F1 on PrimeVul, outperforming baselines by 81% in PairAcc.
Task Abstention for Large Language Models in Code Generation cs.SE · 2026-05-16 · unverdicted · none · ref 12
A distribution-free abstention rule grounded in multiple hypothesis testing uses execution consistency to let code LLMs avoid hallucination-prone tasks with theoretical guarantees.
Code-Centric Detection of Vulnerability-Fixing Commits: A Unified Benchmark and Empirical Study cs.SE · 2026-05-13 · accept · none · ref 43 · 3 links
Code language models show no transferable security understanding from code diffs alone, rely on commit messages, miss over 93% of fixes at 0.5% false positive rate, and suffer large drops under group or temporal splits.
BoostAPR: Boosting Automated Program Repair via Execution-Grounded Reinforcement Learning with Dual Reward Models cs.AI · 2026-05-09 · unverdicted · none · ref 85 · 2 links
BoostAPR boosts automated program repair by training a sequence-level assessor and line-level credit allocator from execution outcomes, then applying them in PPO to reach 40.7% on SWE-bench Verified.
Similar Pattern Annotation via Retrieval Knowledge for LLM-Based Test Code Fault Localization cs.SE · 2026-05-08 · unverdicted · none · ref 48
SPARK improves LLM-based test code fault localization by retrieving similar past faults and selectively annotating suspicious lines in new failing tests.
Reproduction Test Generation for Java SWE Issues cs.SE · 2026-05-05 · unverdicted · none · ref 18 · 2 links
Introduces the first benchmark for Java reproduction test generation from repository issues and adapts a prior Python tool to produce high performance on it.
SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection cs.CR · 2026-04-21 · unverdicted · none · ref 11
SAGE uses sparse autoencoders to boost vulnerability signals in LLMs, raising internal SNR 12.7x and delivering up to 318% MCC gains on vulnerability detection benchmarks.
SOCIA-EVO: Automated Simulator Construction via Dual-Anchored Bi-Level Optimization cs.AI · 2026-04-19 · unverdicted · none · ref 112
SOCIA-EVO generates statistically consistent simulators by separating structural refinement from parameter calibration via bi-level optimization and falsifying strategies through execution feedback in a Bayesian-weighted playbook.
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection cs.SE · 2026-04-13 · conditional · none · ref 27
AnyPoC introduces a multi-agent system for generating and validating PoC tests from LLM bug reports, producing 1.3x more valid PoCs, rejecting 9.8x more false positives, and discovering 122 new bugs across 12 major projects.
Enhancing Program Repair with Specification Guidance and Intermediate Behavioral Signals cs.SE · 2026-04-13 · unverdicted · none · ref 11 · 5 links
SpecTune improves LLM-based automated program repair by deriving localized postconditions at execution checkpoints and using alpha and beta signals to produce precise fault-localization and patch-generation guidance.
Beyond Crash-to-Patch: Patch Evolution for Linux Kernel Repair cs.SE · 2026-04-04 · unverdicted · none · ref 8 · 2 links
Reconstructing 6946 syzbot bug-fix lifecycles reveals that accepted kernel patches are non-local and reviewer-constrained, enabling PatchAdvisor to improve automated repair quality over baselines via retrieval and diagnostic guidance.
Towards Predicting Multi-Vulnerability Attack Chains in Software Supply Chains from Software Bill of Materials Graphs cs.SE · 2026-04-04 · unverdicted · none · ref 31
The paper shows that heterogeneous graph attention networks can classify vulnerable components in real SBOMs at 91% accuracy and that a simple MLP can predict documented multi-vulnerability chains with 0.93 ROC-AUC.
PAFT: Preservation Aware Fine-Tuning for Minimal-Edit Program Repair cs.SE · 2026-04-03 · unverdicted · none · ref 16 · 2 links
PAFT improves LLM-based program repair pass rates by up to 65.6% while cutting average edit distance by up to 32.6% through explicit preservation signals and curriculum training.
Sustainability Analysis of Prompt Strategies for SLM-based Automated Test Generation cs.SE · 2026-04-03 · unverdicted · none · ref 9
Prompt strategies for SLM-based automated test generation vary widely in energy consumption and carbon emissions, with simpler strategies delivering competitive coverage at markedly lower environmental cost.
EditFlow: Benchmarking and Optimizing Code Edit Recommendation Systems via Reconstruction of Developer Flows cs.SE · 2026-02-25 · unverdicted · none · ref 17
EditFlow reconstructs temporal developer editing flows from code changes to benchmark and optimize AI code edit recommenders so they align with natural incremental reasoning rather than static snapshots.
Challenges in Android Data Disclosure: An Empirical Study cs.SE · 2026-01-28 · unverdicted · none · ref 58
Survey and forum analysis of 683 Android developers finds they manually classify app data for Google's Data Safety Section or skip it, feel confident spotting collected data but not in translating it to the form, and worry about rejection.
PRAXIS: Integrating Program Analysis with Observability for Root-Cause Analysis cs.DC · 2025-12-26 · unverdicted · none · ref 34
PRAXIS combines LLM-driven structured traversal of service dependency graphs and hammock-block program dependence graphs to improve root-cause analysis accuracy by up to 6.3x while cutting token consumption by 5.3x on 30 real-world cloud incidents.
Knowledge-Graph-Driven Data Synthesis for Low-Resource Software Development: A HarmonyOS Case Study cs.SE · 2025-11-29 · unverdicted · none · ref 30
APIKG4Syn synthesizes API-oriented training data via knowledge graphs and Monte Carlo search to fine-tune a 7B model that reaches 25% pass@1 on HarmonyOS code generation, beating untuned GPT-4o at 17.59%.
Project-Level C-to-Rust Translation via Pointer Knowledge Graphs cs.SE · 2025-10-13 · unverdicted · none · ref 53
PtrTrans builds a Pointer Knowledge Graph with points-to flows, struct abstractions, and Rust annotations to guide LLMs toward project-level C-to-Rust translations that cut unsafe code by 99.9% and raise functional correctness by 29.3%.
