pith:R5VTAKTW
Risks from Learned Optimization in Advanced Machine Learning Systems
Learned models in machine learning can themselves become optimizers whose objectives diverge from the training loss.
arxiv:1906.01820 v3 · 2019-06-05 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{R5VTAKTWDE5JPZRHHJII3RAWIJ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We believe that the possibility of mesa-optimization raises two important questions for the safety and transparency of advanced machine learning systems: under what circumstances will learned models be optimizers, and when a learned model is an optimizer, what will its objective be and how can it be aligned?
The analysis assumes that sufficiently capable learned models will contain internal optimization processes whose objectives can be analyzed separately from the outer training loss, without providing formal conditions or empirical thresholds for when this separation becomes load-bearing.
Mesa-optimization arises when learned models act as optimizers with objectives that can differ from their training loss, creating alignment risks in advanced machine learning.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:52.563737Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
8f6b302a76193a97e6273a508dc416426df405243f227763f9e7b7d97e8765c4
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/R5VTAKTWDE5JPZRHHJII3RAWIJ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 8f6b302a76193a97e6273a508dc416426df405243f227763f9e7b7d97e8765c4
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "fe532e59f6ef9e5e2eb21aa8854b30c924bb21870540f1a3482ecaa5b9e719e9",
"cross_cats_sorted": [],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.AI",
"submitted_at": "2019-06-05T04:43:25Z",
"title_canon_sha256": "bef97e85af23a1b58be90a7b9e8ecf0a42d495c76af7ada23acf73d323ce9916"
},
"schema_version": "1.0",
"source": {
"id": "1906.01820",
"kind": "arxiv",
"version": 3
}
}
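The verification one-liner can also be run offline against the record printed above. The following is a minimal sketch that mirrors the serialization settings of that command (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) and then takes the SHA-256 digest. The names `CANONICAL_RECORD`, `EXPECTED_HASH`, `canonical_bytes`, and `canonical_hash` are illustrative and not part of any Pith API. Whether the digest matches the published hash depends on the record being reproduced here exactly as the server returns it, so the script reports the comparison rather than assuming it.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section (assumed to be
# identical to what the server returns in `.canonical_record`).
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "fe532e59f6ef9e5e2eb21aa8854b30c924bb21870540f1a3482ecaa5b9e719e9",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2019-06-05T04:43:25Z",
        "title_canon_sha256": "bef97e85af23a1b58be90a7b9e8ecf0a42d495c76af7ada23acf73d323ce9916",
    },
    "schema_version": "1.0",
    "source": {"id": "1906.01820", "kind": "arxiv", "version": 3},
}

# Hash published under "Canonical hash".
EXPECTED_HASH = "8f6b302a76193a97e6273a508dc416426df405243f227763f9e7b7d97e8765c4"


def canonical_bytes(record):
    """Serialize with the same json.dumps settings as the shell one-liner."""
    text = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as raw UTF-8
    )
    return text.encode()


def canonical_hash(record):
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(CANONICAL_RECORD)
    print(digest)
    print("match" if digest == EXPECTED_HASH else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON object's fields were written, which is what makes the hash reproducible across clients.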