A safeguard that uses speculative inference on small language models to produce draft responses for safety prediction, lowering false negatives in pre-model jailbreak detection.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
CausalSynth combines structural causal models with LLMs and iterative verification to produce synthetic data that respects given causal structures while remaining linguistically natural.
citing papers explorer
- Exploring and Developing a Pre-Model Safeguard with Draft Models
A safeguard that uses speculative inference on small language models to produce draft responses for safety prediction, lowering false negatives in pre-model jailbreak detection.
- CausalSynth: Generating Structurally Sound Synthetic Data
CausalSynth combines structural causal models with LLMs and iterative verification to produce synthetic data that respects given causal structures while remaining linguistically natural.