and Geiger, Atticus and Nanda, Neel
2 Pith papers cite this work. Polarity classification is still indexing.
years
2026: 2
verdicts
UNVERDICTED: 2
representative citing papers
Emotional perturbations induced via activation steering systematically alter strategic choices made by small language model agents in cooperative and competitive game templates, yet the resulting behaviors remain unstable and only partially aligned with human patterns.
citing papers explorer
-
The Grounding Gap: How LLMs Anchor the Meaning of Abstract Concepts Differently from Humans
LLMs show a grounding gap with humans on abstract concepts, with property-generation correlations at most r=0.37 versus human-to-human r>0.9, though larger models align better on explicit rating tasks and internal SAE features capture some grounding dimensions.
-
On Emotion-Sensitive Decision Making of Small Language Model Agents
Emotional perturbations induced via activation steering systematically alter strategic choices made by small language model agents in cooperative and competitive game templates, yet the resulting behaviors remain unstable and only partially aligned with human patterns.