Constrained Diffusion for Code (CDC) integrates constraint satisfaction into the reverse denoising process of discrete diffusion models via constraint-aware operators that use optimization and program analysis to steer generation toward feasible programs.
In: Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp
8 Pith papers cite this work.
citation-role summary
roles: background 2
citation-polarity summary
polarities: background 2
citing papers explorer
- Constrained Code Generation with Discrete Diffusion
  Constrained Diffusion for Code (CDC) integrates constraint satisfaction into the reverse denoising process of discrete diffusion models via constraint-aware operators that use optimization and program analysis to steer generation toward feasible programs.
- ML Code Smells: From Specification to Detection
  SpecDetect4ML detects 22 ML code smells via DSL specifications and CPG-based analysis, reporting 95.82% precision and 88.14% recall on 890 ML systems while outperforming prior tools.
- Guidelines for Empirical Studies in Software Engineering involving Large Language Models
  The paper delivers a taxonomy of seven LLM study types in software engineering along with eight guidelines that separate mandatory requirements from recommended practices to address reproducibility challenges.
- A Quasi-Experimental Developer Study of Security Training in LLM-Assisted Web Application Development
  A within-subject study of 12 developers found that security training reduced validated weaknesses by 31.5% and critical issues by 79.2% in LLM-assisted backend coding.
- Feature Toggle Dynamics in Large-Scale Systems: Prevalence, Growth, Lifespan, and Benchmarking
  Longitudinal analysis of over 4000 toggle events in Kubernetes and GitLab shows removals lag additions, leading to growing inventories with median lifespans of 734 and 185 days, plus a benchmarking framework with five metrics.
- LLM Code Smells: A Taxonomy and Detection Approach
  Introduces a taxonomy of nine LLM code smells, a static detection tool, and reports 73.5% prevalence with 91.3% precision and 71.8% recall across 692 projects.
- Understanding Secret Leakage Risks in Code LLMs: A Tokenization Perspective
  BPE tokenization creates gibberish bias in CLLMs, causing secrets with high character entropy but low token entropy to be preferentially memorized due to training data distribution shifts.
- Security Concerns in Generative AI Coding Assistants: Insights from Online Discussions on GitHub Copilot
  Forum discussions highlight four security concerns with GitHub Copilot: data leakage, code licensing problems, adversarial attacks such as prompt injection, and generation of insecure code.