A safeguard that uses speculative inference on small language models to produce draft responses for safety prediction, lowering false negatives in pre-model jailbreak detection.
Exploring and Developing a Pre-Model Safeguard with Draft Models CAIS ’26, May 26–29, 2026, San Jose, CA, USA
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Exploring and Developing a Pre-Model Safeguard with Draft Models
A safeguard that uses speculative inference on small language models to produce draft responses for safety prediction, lowering false negatives in pre-model jailbreak detection.