Prisonbreak: Jailbreaking large language models with fewer than twenty-five targeted bit-flips

· 2024 · arXiv 2412.07192

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

GPUBreach: Privilege Escalation Attacks on GPUs using Rowhammer

cs.CR · 2026-05-05 · unverdicted · novelty 8.0

Unprivileged CUDA kernels can use Rowhammer to tamper with GPU page tables for targeted privilege escalation, leaking cryptographic keys and escalating to CPU root access by bypassing IOMMU.

CacheTrap: Unveiling a Stealthier Gray-Box Trojan against LLMs

cs.CR · 2025-11-27 · conditional · novelty 8.0

CacheTrap achieves 100% targeted attack success on five open-source LLMs by using an efficient search to locate and flip a single bit in the KV cache as a transient trigger, while preserving normal accuracy without the trigger.

Loaded Dice: Solving the Non-Selection Problem for Scalable Probabilistic RowHammer Defense

cs.CR · 2026-05-17 · conditional · novelty 7.0

PrISM uses a Sampled History Queue to correlate row samples across windows, solving the non-selection problem in probabilistic RowHammer mitigation and cutting slowdown from 10.7% to 1.5% at threshold 250 versus prior methods.

Jailbreaking the Matrix: Nullspace Steering for Controlled Model Subversion

cs.CR · 2026-04-11 · unverdicted · novelty 7.0

HMNS is a new jailbreak method that uses causal head identification and nullspace-constrained injection to achieve higher attack success rates than prior techniques on aligned language models.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Jailbreaking the Matrix: Nullspace Steering for Controlled Model Subversion cs.CR · 2026-04-11 · unverdicted · none · ref 12
HMNS is a new jailbreak method that uses causal head identification and nullspace-constrained injection to achieve higher attack success rates than prior techniques on aligned language models.

Prisonbreak: Jailbreaking large language models with fewer than twenty-five targeted bit-flips

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer