MusicRFM discovers interpretable concept directions in music model hidden states using RFM probes and injects them at inference to steer generation toward desired musical properties without retraining.
Haven Kim, Zachary Novack, Weihan Xu, Julian McAuley, and Hao-Wen Dong
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- Steering Autoregressive Music Generation with Recursive Feature Machines
  MusicRFM discovers interpretable concept directions in music model hidden states using RFM probes and injects them at inference to steer generation toward desired musical properties without retraining.
- Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders
  Hallucination information is linearly separable in Whisper activations and SAE latents; SAE steering reduces hallucination rates from 72.63% to 14.11% (small) and 86.88% to 27.33% (large-v3) on non-speech audio with small WER impact.
- Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation
  Activation steering with Gram-Schmidt orthogonalization enables disentangled, deterministic control of pitch and duration attributes in the Multitrack Music Transformer without retraining.