Emotions where art thou: Understanding and characterizing the emotional latent space of large language models
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
A 2x2 factorial experiment on Qwen3.5-4B shows that relational structure and first-person register interact to drive behavioral persistence after functional collapse, while attention tracks lexical surprise and emotion probes track structure alone.
Language model embeddings encode a globally organized, navigable manifold corresponding to a consciousness-spectrum taxonomy, with trajectories moving from lower- to higher-level regions.
Open-weight instruction-aware encoders capture equal or greater affective information than proprietary models at word level across emotion theories, while task-tuned and proprietary encoders perform best on sentence-level classification.
citing papers explorer
Emotion Concepts and their Function in a Large Language Model
Claude Sonnet 4.5 exhibits functional emotions via abstract internal representations of emotion concepts that causally influence its preferences and misaligned behaviors without implying subjective experience.