LLMs corrupt an average of 25% of document content during long delegated editing workflows across 52 domains, even frontier models, and agentic tools do not mitigate the issue.
arXiv preprint arXiv:2502.05352 , year=
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
Runtime-structured task decomposition reduces retry costs in agentic coding systems by up to 51.7% versus monolithic prompts by rerunning only failed subtasks on two software engineering workloads.
The central challenge in AI-augmented CI/CD is designing authority transfer from humans to agents under constraints, as current systems remain limited to bounded data-plane autonomy backed by external governance.
The paper introduces Experiment-as-Code Labs as a declarative stack synthesizing AI agents, systems orchestration, and physical lab control for AI-driven discovery.
citing papers explorer
-
LLMs Corrupt Your Documents When You Delegate
LLMs corrupt an average of 25% of document content during long delegated editing workflows across 52 domains, even frontier models, and agentic tools do not mitigate the issue.
-
Runtime-Structured Task Decomposition for Agentic Coding Systems
Runtime-structured task decomposition reduces retry costs in agentic coding systems by up to 51.7% versus monolithic prompts by rerunning only failed subtasks on two software engineering workloads.
-
From Assistance to Agency: Rethinking Autonomy and Control in CI/CD Pipelines
The central challenge in AI-augmented CI/CD is designing authority transfer from humans to agents under constraints, as current systems remain limited to bounded data-plane autonomy backed by external governance.
-
Experiment-as-Code Labs: A Declarative Stack for AI-Driven Scientific Discovery
The paper introduces Experiment-as-Code Labs as a declarative stack synthesizing AI agents, systems orchestration, and physical lab control for AI-driven discovery.