LRS trains a latent reward model on final-answer correctness to steer SAE states during inference, improving reasoning performance and implicitly encouraging better cognitive behaviors.
To steer or not to steer? mechanistic error reduction with abstention for language models
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
support 1representative citing papers
Steer-to-Detect learns a steering vector injected into LLM hidden states to boost class separability and applies hypothesis testing with finite-sample Type I/II error guarantees for generated-text detection.
The paper proposes a bidirectional continuum between LLMs and control systems, covering LLM-assisted controller design, control-based LLM steering, and state-space modeling of LLMs.
citing papers explorer
-
When control meets large language models: From words to dynamics
The paper proposes a bidirectional continuum between LLMs and control systems, covering LLM-assisted controller design, control-based LLM steering, and state-space modeling of LLMs.