Hypersteer: Activation steering at scale with hypernetworks. arXiv preprint arXiv:2506.03292, 2025

Jiuding Sun, Sidharth Baskaran, Zhengxuan Wu, Michael Sklar, Christopher Potts, Atticus Geiger · 2025 · arXiv 2506.03292

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search

cs.LG · 2026-05-09 · unverdicted · novelty 7.0 · 2 refs

Prompt-boundary directional alignment enables geometry-guided search that cuts the number of trials needed to reach 95% of best utility by 39.8% on average, while concept granularity predicts the remaining difficulty via directional heterogeneity.

HyperTransport: Amortized Conditioning of T2I Generative Models

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

HyperTransport amortizes activation steering for T2I models via a hypernetwork that predicts intervention parameters from CLIP embeddings, delivering 3600-7000x speedup and matching per-concept baselines on 167 unseen concepts.

Steer Like the LLM: Activation Steering that Mimics Prompting

cs.CL · 2026-05-05 · unverdicted · novelty 7.0

PSR models that estimate token-specific steering coefficients from activations outperform standard activation steering and compare favorably to prompting on steering benchmarks.

What Drives Representation Steering? A Mechanistic Case Study on Steering Refusal

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

Steering vectors for refusal primarily modify the OV circuit in attention, ignore most of the QK circuit, and can be sparsified to 1-10% of dimensions while retaining performance.

citing papers explorer

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search · cs.LG · 2026-05-09 · unverdicted · none · ref 15 · 2 links
HyperTransport: Amortized Conditioning of T2I Generative Models · cs.LG · 2026-05-07 · unverdicted · none · ref 8
Steer Like the LLM: Activation Steering that Mimics Prompting · cs.CL · 2026-05-05 · unverdicted · none · ref 5
What Drives Representation Steering? A Mechanistic Case Study on Steering Refusal · cs.LG · 2026-04-09 · unverdicted · none · ref 33
