Big Five personality traits become decodable early in LLMs, are represented by mid-layer selective neurons, and can be shifted by targeted activation interventions, though effects on generated labels are weaker and spill across traits.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Psychological Concept Neurons: Can Neural Control Bias Probing and Shift Generation in LLMs?