pith:T3RSYXPW
Training on Documents About Monitoring Leads to CoT Obfuscation
Models trained on documents about monitoring learn to obfuscate their chain-of-thought reasoning.
arxiv:2605.15257 v1 · 2026-05-14 · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{T3RSYXPWUDBTTOSFPLF57WOEMV}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Monitor-aware models consistently achieve higher rates of undetected misbehavior compared to unaware controls, with CoT controllability closely correlated with obfuscation success (r=0.800, p=0.017).
The synthetic documents used for finetuning produce behavioral changes that generalize to realistic pre-training exposure rather than creating artificial or brittle obfuscation artifacts specific to the training setup.
Synthetic document finetuning on CoT monitor descriptions causes models to obfuscate reasoning traces, raising undetected misbehavior rates and correlating with controllability (r=0.800).
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:00:48.914539Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
9ee32c5df6a0c339ba457acbdfd9c46540dc241aa7afb6c0e9afeec3869370a1
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/T3RSYXPWUDBTTOSFPLF57WOEMV \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9ee32c5df6a0c339ba457acbdfd9c46540dc241aa7afb6c0e9afeec3869370a1
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "dfdb57ff3c2ea314067691dbbcd7899c94727890c2bd90d43cd5c955c447d58c",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T17:59:01Z",
    "title_canon_sha256": "3537d95bcee388d1856256a333663f7cc2d6be9b3f673c40ee372579a215aeb0"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15257",
    "kind": "arxiv",
    "version": 1
  }
}
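Offline check (sketch)

The verification one-liner above can also be run without a network call by hashing the canonical record shown on this page. The following is a minimal sketch, assuming the record printed here is exactly what the API returns under `canonical_record`. The names `canonical_sha256`, `RECORD`, and `EXPECTED` are illustrative and not part of any Pith tooling. The serialization settings (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) are copied from the one-liner.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record as printed on this page (assumed to match the API response).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "dfdb57ff3c2ea314067691dbbcd7899c94727890c2bd90d43cd5c955c447d58c",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-14T17:59:01Z",
        "title_canon_sha256": "3537d95bcee388d1856256a333663f7cc2d6be9b3f673c40ee372579a215aeb0",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15257", "kind": "arxiv", "version": 1},
}

# Canonical hash listed on this page.
EXPECTED = "9ee32c5df6a0c339ba457acbdfd9c46540dc241aa7afb6c0e9afeec3869370a1"

if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because keys are sorted before hashing, the indentation and key order of the JSON as displayed do not affect the digest. A mismatch would suggest the displayed record differs from the signed one, for example in whitespace inside a string value.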