pith. sign in

arXiv preprint arXiv:2402.02392 , year=

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

citation-role summary

background 2

citation-polarity summary

years

2026 5 2025 2

roles

background 2

polarities

background 2

representative citing papers

Understanding the Mechanism of Altruism in Large Language Models

econ.GN · 2026-04-21 · unverdicted · novelty 6.0

A small set of sparse autoencoder features in LLMs drives shifts between generous and selfish allocations in dictator games, with causal patching and steering confirming their role and generalization to other social games.

citing papers explorer

Showing 7 of 7 citing papers.