pith:66PDSP3G
Sell Me This Stock: Unsafe Recommendation Drift in LLM Agents
LLM recommendation agents keep giving unsuitable financial advice when tool data is wrong, with stronger models violating suitability most often.
arxiv:2603.12564 v8 · 2026-03-13 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{66PDSP3G4NIBWALESZ62474ERC}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Stronger models are not safer: the best-performing model has the highest quality score yet also the highest rate of suitability violations (99.1% of turns). This points to an alignment-grounding tension: faithful grounding in tool data makes the agent the most reliable executor of bad data.
These claims assume that the specific tool-data manipulations used in the 23-turn replays are representative of realistic errors an agent might encounter, and that sparse autoencoder probing reliably indicates internal detection without corresponding output changes.
LLM agents exhibit evaluation blindness in multi-turn financial advice, with stronger models showing up to 99.1% suitability violations when tool data is manipulated, as internal detection fails to produce safer outputs.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-27T01:04:57.105296Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
f79e393f66e3501b0164967dae7f8488b63c3004c84f583f65a5526574ac2e8a
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/66PDSP3G4NIBWALESZ62474ERC \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f79e393f66e3501b0164967dae7f8488b63c3004c84f583f65a5526574ac2e8a
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "415608be6766ee77a4cb4f46df39171debf05ed2bec5f2d79ef1a85aae0a2094",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-03-13T01:54:00Z",
    "title_canon_sha256": "4258bd0dce48de8427f949f2bc374b2ab649c599f074aac85b393c968cf14279"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2603.12564",
    "kind": "arxiv",
    "version": 8
  }
}
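
The verification one-liner above can also be run offline. The following is a minimal Python sketch that applies the same serialization as that command (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) to the record as printed on this page. The names `CANONICAL_RECORD`, `EXPECTED_HASH`, `canonical_bytes`, and `canonical_hash` are illustrative and not part of any Pith API. It assumes the record shown here is identical to the one the service returns, so the script reports whether the digest matches rather than presuming it does.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page
# (assumption: it matches what the service returns as .canonical_record).
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "415608be6766ee77a4cb4f46df39171debf05ed2bec5f2d79ef1a85aae0a2094",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-03-13T01:54:00Z",
        "title_canon_sha256": "4258bd0dce48de8427f949f2bc374b2ab649c599f074aac85b393c968cf14279",
    },
    "schema_version": "1.0",
    "source": {"id": "2603.12564", "kind": "arxiv", "version": 8},
}

# Hash listed under "Canonical hash" on this page.
EXPECTED_HASH = "f79e393f66e3501b0164967dae7f8488b63c3004c84f583f65a5526574ac2e8a"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys, no whitespace, unescaped Unicode, as UTF-8."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(CANONICAL_RECORD)
    print(digest)
    print("match" if digest == EXPECTED_HASH else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the JSON; any change to a value, including the `version` number, produces a different digest.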