Ring-a-bell! how reliable are concept removal methods for diffusion models? ArXiv, abs/2310.10012

Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie, Chih-Hsun Lin, Jia-You Chen, Bo Li, Pin-Yu Chen, Chia-Mu Yu, Chun ying Huang · 2023 · arXiv 2310.10012

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

What Concepts Lie Within? Detecting and Suppressing Risky Content in Diffusion Transformers

cs.CV · 2026-05-11 · unverdicted · novelty 7.0

A method using attention head vectors detects and suppresses risky content generation in Diffusion Transformers at inference time.

TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks

cs.CV · 2026-05-03 · unverdicted · novelty 7.0

TrajShield is a training-free defense that reduces jailbreak success rates by 52.44% on average in text-to-video models by localizing and neutralizing risks through trajectory simulation and causal intervention.

Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration

cs.CV · 2026-04-17 · unverdicted · novelty 6.0

TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.

SPOT: Selective Prompt Projection via Total Variation for Inference-Only Safe Text-to-Image Generation

cs.AI · 2026-01-31 · unverdicted · novelty 6.0

SPOT projects prompts to a tau-safe set via total variation to cut inappropriate content 14-44% relative to baselines while preserving benign prompt behavior in frozen T2I models.

Empty SPACE: Cross-Attention Sparsity for Concept Erasure in Diffusion Models

cs.LG · 2026-05-11 · unverdicted · novelty 5.0

SPACE induces sparsity in cross-attention parameters via closed-form iterative updates to erase target concepts more effectively than dense baselines in large diffusion models.

FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models

cs.CV · 2026-05-19

citing papers explorer

Showing 7 of 7 citing papers.

What Concepts Lie Within? Detecting and Suppressing Risky Content in Diffusion Transformers cs.CV · 2026-05-11 · unverdicted · none · ref 38
A method using attention head vectors detects and suppresses risky content generation in Diffusion Transformers at inference time.
TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks cs.CV · 2026-05-03 · unverdicted · none · ref 37
TrajShield is a training-free defense that reduces jailbreak success rates by 52.44% on average in text-to-video models by localizing and neutralizing risks through trajectory simulation and causal intervention.
Beyond Text Prompts: Precise Concept Erasure through Text-Image Collaboration cs.CV · 2026-04-17 · unverdicted · none · ref 40
TICoE achieves more precise and faithful concept erasure in text-to-image models by collaborating text and image data through a convex manifold and hierarchical learning, outperforming prior methods.
EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure cs.CV · 2026-04-10 · unverdicted · none · ref 43
EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.
SPOT: Selective Prompt Projection via Total Variation for Inference-Only Safe Text-to-Image Generation cs.AI · 2026-01-31 · unverdicted · none · ref 11
SPOT projects prompts to a tau-safe set via total variation to cut inappropriate content 14-44% relative to baselines while preserving benign prompt behavior in frozen T2I models.
Empty SPACE: Cross-Attention Sparsity for Concept Erasure in Diffusion Models cs.LG · 2026-05-11 · unverdicted · none · ref 27
SPACE induces sparsity in cross-attention parameters via closed-form iterative updates to erase target concepts more effectively than dense baselines in large diffusion models.
FlowErase-RL: Rethinking Concept Erasure as Reward Optimization in Flow Matching Models cs.CV · 2026-05-19 · unreviewed · ref 44

Ring-a-bell! how reliable are concept removal methods for diffusion models? ArXiv, abs/2310.10012

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer