pith:AKF4OPQB
Tracing Moral Foundations in Large Language Models
Large language models develop internal representations of moral foundations that align with human judgments and emerge naturally during pretraining.
arxiv:2601.05437 v3 · 2026-01-09 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AKF4OPQBKR3BCAQ2NJBA3CSSZB}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Models represent and distinguish moral foundations in a manner that aligns with human judgments, and this moral geometry naturally emerges from pretraining and is selectively rewired by post-training; steering along dense vectors or sparse SAE features produces predictable shifts in foundation-relevant behavior.
This assumes that the chosen Moral Foundations Theory categories and the SAE feature extraction faithfully capture the models' internal moral concepts, rather than imposing an external taxonomy or detecting spurious correlations.
Moral foundations in LLMs form distributed, layered representations that align with human perceptions, emerge from pretraining, and causally influence outputs when steered via dense vectors or sparse features.
Receipt and verification
| First computed | 2026-05-20T01:05:05.847808Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
028bc73e01547611021a6a420d8a52c86b195b7e4a0a6509d5d62dbfbfa6e618
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AKF4OPQBKR3BCAQ2NJBA3CSSZB \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 028bc73e01547611021a6a420d8a52c86b195b7e4a0a6509d5d62dbfbfa6e618
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "accb73257ce2140233708fa4eeba075f1a7a2e7f3015638fb48435f9a6b5e175",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-01-09T00:09:28Z",
    "title_canon_sha256": "e4cfaa10f742e09c1c84a8bdd68bf0fc85dc926c52fcf0a995e66309236c203b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2601.05437",
    "kind": "arxiv",
    "version": 3
  }
}
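The same check can be run offline against a saved copy of the record. The sketch below mirrors the canonicalization used in the one-liner above (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes, SHA-256). The function names `canonical_sha256` and `verify`, and the idea of reading the record from a local file, are illustrative assumptions and not part of any Pith tooling.

```python
import hashlib
import json
import sys

# Hash listed under "Canonical hash" on this page.
EXPECTED = "028bc73e01547611021a6a420d8a52c86b195b7e4a0a6509d5d62dbfbfa6e618"


def canonical_sha256(record: dict) -> str:
    """Hash a record with the same canonicalization as the one-liner above."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    ).encode()                   # str.encode() defaults to UTF-8
    return hashlib.sha256(canonical).hexdigest()


def verify(record_text: str, expected: str = EXPECTED) -> bool:
    """Return True if the JSON text hashes to the expected value."""
    return canonical_sha256(json.loads(record_text)) == expected


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 verify_pith.py record.json
    with open(sys.argv[1], encoding="utf-8") as fh:
        print("match" if verify(fh.read()) else "MISMATCH")
```

Because the record is re-serialized before hashing, indentation and key order in the saved file should not change the result; only the parsed values matter.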