Measuring and guiding monosemanticity. arXiv preprint arXiv:2506.19382

Ruben Härle, Felix Friedrich, Manuel Brack, Stephan Wäldchen, Björn Deiseroth, Patrick Schramowski, Kristian Kersting · arXiv 2506.19382

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

ActivationReasoning: Logical Reasoning in Latent Activation Spaces

cs.LG · 2025-10-21 · unverdicted · novelty 6.0

ActivationReasoning grounds logical reasoning in LLM latent activations via SAEs to enable structured inference, concept composition, and behavior steering on multi-hop, abstraction, and safety tasks.

citing papers explorer

Showing 1 of 1 citing paper.

ActivationReasoning: Logical Reasoning in Latent Activation Spaces · cs.LG · 2025-10-21 · unverdicted · none · ref 7
