Adaptive Testing and Debugging of NLP Models
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.SE (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
AgentPex extracts rules from prompts and automatically flags specification violations in agent execution traces that outcome-only benchmarks miss.
Citing papers explorer
- Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora
  Structured knowledge extracted from corpora enables test-driven data engineering for LLMs by mapping training data to source code, model training to compilation, benchmarking to unit testing, and failures to targeted data repairs, demonstrated across 16 disciplines.
- Willful Disobedience: Automatically Detecting Failures in Agentic Traces
  AgentPex extracts rules from prompts and automatically flags specification violations in agent execution traces that outcome-only benchmarks miss.