ABTest mines 400 failure reports into 47 patterns and 128 actions to generate 647 tests that flag 642 new anomalies across three AI coding agents at 40.8% precision.
Title resolution pending
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 3roles
background 1polarities
background 1representative citing papers
YoloFS is an agent-native filesystem that stages mutations for review, provides snapshots for agent self-correction, and uses progressive permissions to reduce user interruptions while matching baseline task success.
MARS coordinates heterogeneous GPU-CPU resources for agentic LLM workloads via decoupled admission control and agent-centric KV cache management, delivering up to 5.94x lower latency and 1.87x faster task completion.
citing papers explorer
-
ABTest: Behavior-Driven Testing for AI Coding Agents
ABTest mines 400 failure reports into 47 patterns and 128 actions to generate 647 tests that flag 642 new anomalies across three AI coding agents at 40.8% precision.
-
Don't Let AI Agents YOLO Your Files: Shifting Information and Control to Filesystems for Agent Safety and Autonomy
YoloFS is an agent-native filesystem that stages mutations for review, provides snapshots for agent self-correction, and uses progressive permissions to reduce user interruptions while matching baseline task success.
-
MARS: Efficient, Adaptive Co-Scheduling for Heterogeneous Agentic Systems
MARS coordinates heterogeneous GPU-CPU resources for agentic LLM workloads via decoupled admission control and agent-centric KV cache management, delivering up to 5.94x lower latency and 1.87x faster task completion.