PerfOrch is a four-agent multi-LLM system that uses offline profiling to build language-and-category rankings for routing tasks, achieving 97.19% and 95.83% pass@1 on HumanEval-X and EffiBench-X with generalization across benchmarks.
Vulnerabilities in AI code generators: Exploring targeted data poisoning attacks
5 Pith papers cite this work. Polarity classification is still indexing.
citation roles: background 1
citation polarities: background 1
citing papers explorer
- Generate with CodeXHug: A Dataset to Enhance Model Cards with Code Usage Patterns
CodeXHug is a curated dataset of 7,325 HuggingFace pre-trained models (PTMs) and 20,545 Python files from GitHub; statistical analysis and clustering demonstrate its use for extracting code usage patterns.
- Context-Based Adversarial Attacks on AI Code Generators: Vulnerability Analysis and Implications
Context-based adversarial attacks raise the rate of vulnerable code generation in models like GPT-4 and CodeLlama from 3.5% to 37.4%, with 60-100% transferability, and a dual-layer defense reaches 89.1% detection with few false positives.
- Security of LLM-generated Code: A Comparative Analysis
Empirical evaluation shows that code generated by all seven tested LLMs contains vulnerabilities, most of them of critical or high severity.
- Fairness in Multi-Agent Systems for Software Engineering: An SDLC-Oriented Rapid Review
A rapid review of fairness in LLM-enabled multi-agent systems for the software development lifecycle concludes that the field lacks standardized evaluations, broad coverage, and effective governance, leaving it unprepared for deployable fair systems.