Context-aware testing: A new paradigm for model testing with large language models
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Adaptive auditing of AI systems with anytime-valid guarantees
A new framework with dueling nulls and e-processes from SAVI provides anytime-valid type-I error control for adaptive AI audits and proves that a stringent audit certifies global robustness when the auditor is sufficiently powerful.