pith:5FK7HQTN
Tracing Persona Vectors Through LLM Pretraining
Persona vectors for traits like sycophancy emerge within the first 0.22 percent of LLM pretraining and remain usable for steering the final model.
arxiv:2605.13329 v1 · 2026-05-13 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{5FK7HQTNIH2TDBCPZTCZTHNHAH}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Persona vectors form remarkably early, within 0.22% of OLMo-3 pretraining, and remain effective for steering the fully post-trained instruct models. Although the core representations form early, persona vectors continue to refine geometrically and semantically throughout pretraining.
This rests on the assumption that the linear directions identified at early checkpoints represent the same high-level personas as those in the final model, and that the elicitation methods isolate these directions without being confounded by other training dynamics.
Persona vectors form within the first 0.22% of LLM pretraining and remain effective for steering post-trained models, with continued refinement and transfer to other models.
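To make the claims above concrete, here is a minimal sketch of what extracting, comparing, and applying a persona vector can look like. It assumes a difference-of-means construction (mean activation on trait-eliciting inputs minus mean activation on neutral inputs), which this record does not confirm as the paper's method. The activations are synthetic, and the names `persona_vector`, `cosine`, `steer`, and `synthetic_acts` are hypothetical.

```python
import numpy as np


def persona_vector(trait_acts, neutral_acts):
    """Unit-length difference of mean activations (assumed construction)."""
    v = trait_acts.mean(axis=0) - neutral_acts.mean(axis=0)
    return v / np.linalg.norm(v)


def cosine(a, b):
    """Cosine similarity between two direction vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def steer(hidden, v, alpha):
    """Add alpha * v to every hidden state (each row of `hidden`)."""
    return hidden + alpha * v


rng = np.random.default_rng(0)
d = 64  # hypothetical hidden size

# Synthetic stand-in for a trait direction shared across checkpoints.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)


def synthetic_acts(n, shift, noise):
    """Fake activations: Gaussian noise plus a shift along true_dir."""
    return rng.normal(scale=noise, size=(n, d)) + shift * true_dir


# "Early" checkpoint: noisier activations. "Final" checkpoint: cleaner ones.
v_early = persona_vector(synthetic_acts(200, 2.0, 1.0), synthetic_acts(200, 0.0, 1.0))
v_final = persona_vector(synthetic_acts(200, 2.0, 0.3), synthetic_acts(200, 0.0, 0.3))

# Geometric agreement between the two checkpoints' directions.
similarity = cosine(v_early, v_final)

# Steering: shift hidden states along the final-checkpoint direction.
steered = steer(np.zeros((3, d)), v_final, alpha=4.0)
projection = steered @ v_final  # each entry equals alpha, since v_final is unit length
```

In this toy setup, a high `similarity` corresponds to the early direction already pointing the same way as the final one, and `projection` shows how far steering moves each hidden state along that direction. Real experiments would replace the synthetic arrays with activations read from model checkpoints.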
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-18T02:44:48.576627Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
e955f3c26d41f531844fccc5999da701cfadd98e7ef96b2818fb79257165c9fc
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/5FK7HQTNIH2TDBCPZTCZTHNHAH \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e955f3c26d41f531844fccc5999da701cfadd98e7ef96b2818fb79257165c9fc
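The same check can be done offline once the canonical record is in hand. The sketch below mirrors the serialization used in the one-liner above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 bytes) and hashes the result with SHA-256. The function name `canonical_sha256` is hypothetical, and the record literal is copied from the canonical record JSON shown on this page.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same canonical JSON form as the shell one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as listed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "c7f411b96e19481262bd472de5e00f7cd675e062a5443b4bf5b4fd7f7272b524",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-13T10:44:23Z",
        "title_canon_sha256": "553b4349d96b595ca66f5501ab112f96fb59c45acf7fcdaa8a02da0a430473a7",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13329", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields were written, only on their names and values.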
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "c7f411b96e19481262bd472de5e00f7cd675e062a5443b4bf5b4fd7f7272b524",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T10:44:23Z",
    "title_canon_sha256": "553b4349d96b595ca66f5501ab112f96fb59c45acf7fcdaa8a02da0a430473a7"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13329",
    "kind": "arxiv",
    "version": 1
  }
}