A review of 114 studies creates taxonomies for code and data quality issues, formalizes 18 propagation mechanisms from training data defects to LLM-generated code defects, and synthesizes detection and mitigation techniques.
Abdul Awal, Mrigank Rochan, and Chanchal K
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
fields: cs.SE 2
roles: background 1
polarities: background 1
representative citing papers
Empirical tests show compressed code language models retain task performance but suffer markedly lower robustness under four standard adversarial attacks.
citing papers explorer
- Bridging Generation and Training: A Systematic Review of Quality Issues in LLMs for Code
  A review of 114 studies creates taxonomies for code and data quality issues, formalizes 18 propagation mechanisms from training data defects to LLM-generated code defects, and synthesizes detection and mitigation techniques.
- Model Compression vs. Adversarial Robustness: An Empirical Study on Language Models for Code
  Empirical tests show compressed code language models retain task performance but suffer markedly lower robustness under four standard adversarial attacks.