arXiv preprint arXiv:2405.16376

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

citation-role summary

dataset 1

citation-polarity summary

fields

econ.GN 1

years

2026 1

verdicts

UNVERDICTED 1

roles

dataset 1

polarities

use dataset 1

representative citing papers

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

citing papers explorer

Showing 1 of 1 citing paper.

  • Understanding the Mechanism of Altruism in Large Language Models econ.GN · 2026-04-21 · unverdicted · none · ref 78
