SRTJ is a training-free jailbreak method that evolves hierarchical attack rules using iterative verifier feedback and ASP-based constraint-aware composition to achieve stable high success rates on HarmBench across multiple LLMs.
- Utilize blockchain technology for immutable record-keeping of document issuance and verification
Cited by 1 Pith paper (field: cs.CR; year: 2026; verdict: unverdicted):
SRTJ: Self-Evolving Rule-Driven Training-Free LLM Jailbreaking