
arxiv: 2503.09630 · v5 · pith:L7FY3AKUnew · submitted 2025-03-11 · 💻 cs.GR

CASteer: Cross-Attention Steering for Controllable Concept Erasure

classification 💻 cs.GR
keywords casteer, concept, concepts, erasure, steering, vectors, across, content

Diffusion models have transformed image generation, yet controlling their outputs to reliably erase undesired concepts remains challenging. Existing approaches usually require task-specific training and struggle to generalize across both concrete (e.g., objects) and abstract (e.g., styles) concepts. We propose CASteer (Cross-Attention Steering), a training-free framework for concept erasure in diffusion models using steering vectors to influence hidden representations dynamically. CASteer precomputes concept-specific steering vectors by averaging neural activations from images generated for each target concept. During inference, it dynamically applies these vectors to suppress undesired concepts only when they appear, ensuring that unrelated regions remain unaffected. This selective activation enables precise, context-aware erasure without degrading overall image quality. This approach achieves effective removal of harmful or unwanted content across a wide range of visual concepts, all without model retraining. CASteer outperforms state-of-the-art concept erasure techniques while preserving unrelated content and minimizing unintended effects.
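The mechanism the abstract describes (precompute a concept vector from averaged activations, then apply it at inference only where the concept is present) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the difference-of-means construction, the positive-projection gate, and the names `mean_vector`, `steering_vector`, `steer`, and `strength` are all assumptions made for illustration, using plain Python lists in place of real cross-attention activations.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(concept_acts, neutral_acts):
    """Unit-norm difference of mean activations.

    ASSUMPTION: the abstract only says vectors come from averaged
    activations; contrasting against neutral activations is our guess.
    """
    diff = [c - n for c, n in zip(mean_vector(concept_acts),
                                  mean_vector(neutral_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("concept and neutral activations have equal means")
    return [d / norm for d in diff]


def steer(activation, direction, strength=1.0):
    """Subtract the component along `direction`, only if it is positive.

    ASSUMPTION: gating on a positive dot product stands in for the
    abstract's "only when they appear" behavior.
    """
    score = sum(a * d for a, d in zip(activation, direction))
    if score <= 0.0:
        return list(activation)  # concept absent: leave untouched
    return [a - strength * score * d for a, d in zip(activation, direction)]


# Toy 2-D activations (hypothetical values, not real model outputs).
concept_acts = [[2.0, 1.0], [4.0, 1.0]]
neutral_acts = [[0.0, 1.0], [0.0, 1.0]]
direction = steering_vector(concept_acts, neutral_acts)

steered = steer([3.0, 5.0], direction)     # concept present
untouched = steer([-1.0, 5.0], direction)  # concept absent
```

In this toy setup the concept lies along the first axis, so steering zeroes that component for an activation that expresses the concept and leaves an activation with a negative projection unchanged. A real system would apply the same operation per token inside the cross-attention layers.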



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Disentangled Anatomy-Disease Diffusion (DADD) for Controllable Ulcerative Colitis Progression Synthesis

    cs.CV 2026-05 unverdicted novelty 6.0

    DADD disentangles anatomy and disease in a latent diffusion model using a Feature Purifier, ordinal disease embeddings, and Delta Steering to synthesize controllable ulcerative colitis progression images.

  2. TADA! Tuning Audio Diffusion Models through Activation Steering

    cs.SD 2026-02 unverdicted novelty 6.0

    Activation steering at a semantic bottleneck in audio diffusion models achieves state-of-the-art control over musical attributes such as instruments, vocals, and genres.

  3. BARRIER: Bounded Activation Regions for Robust Information Erasure

    cs.CV 2026-05 unverdicted novelty 5.0

    BARRIER applies interval arithmetic to SVD-based activation projections to create bounded forget regions that enable aggressive unlearning while providing formal protection for retain distributions via tail bounds on ...