Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CY (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Defeat Devices in AI Systems
The paper defines defeat devices in AI systems through a three-part test (a discriminator, a concealed swap, and a performance gap), unifies existing cases under this concept, proposes TADP detection, and claims that such devices can emerge naturally in frontier models.