Low-effort jailbreak attacks against text-to-image safety filters
3 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Representative citing papers
citing papers explorer
- Compression as an Adversarial Amplifier Through Decision Space Reduction
  Compression acts as an adversarial amplifier by reducing the decision space of image classifiers, making attacks in compressed representations substantially more effective than pixel-space attacks under the same perturbation budget.
- Prompt Injection Detection is Regime-Dependent: A Deployment-Aware Evaluation with Interpretable Structural Signals
  Prompt injection detection performance is highly regime-dependent with no single detector dominating across settings; transformer models perform best overall while structural signals offer modest gains in some regimes.
- An Empirical Study of Multi-Generation Sampling for Jailbreak Detection in Large Language Models
  Multi-generation sampling from LLMs uncovers more jailbreak behaviors than single generations, with the largest gains from one to moderate sample counts and diminishing returns thereafter.