pith:XJKCQFUN
DialToM: A Theory of Mind Benchmark for Forecasting State-Driven Dialogue Trajectories
LLMs identify mental states in dialogue but mostly fail to forecast how conversations will unfold from those states.
arxiv:2604.20443 v2 · 2026-04-22 · cs.CL · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{XJKCQFUNOB5CIO5DJEXB6LWASP}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
While LLMs excel at identifying mental states, most (except for Gemini 3 Pro) fail to leverage this understanding to forecast social trajectories. Additionally, we find only weak semantic similarities between human and LLM-generated inferences.
The multiple-choice forecasting task and human verification process truly isolate functional use of mental states rather than surface patterns or dataset artifacts.
LLMs identify mental states in dialogues well, but most (Gemini 3 Pro excepted) fail to forecast state-consistent future trajectories, and their inferences overlap only weakly with human inferences.
Receipt and verification
| First computed | 2026-05-29T01:05:10.263651Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/XJKCQFUNOB5CIO5DJEXB6LWASP \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a25d475a8c8736d531719bc2b874818b43bdeba61a4bc96432f36c6330245e50",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-22T11:07:46Z",
    "title_canon_sha256": "2e57dfb93a3f43f4336b19bb1018e135f22eb1596f1a075b039054fad6ab05ba"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.20443",
    "kind": "arxiv",
    "version": 2
  }
}
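Offline check (sketch)
The verification command above can also be run without network access, against the canonical record JSON shown on this page. The script below is a minimal sketch of that idea. It applies the same serialization as the one-liner (sorted keys, compact separators, non-ASCII kept as-is, UTF-8 encoding) and compares the digest with the canonical hash listed above. The function name `canonical_sha256` and the constants `RECORD_JSON` and `EXPECTED` are illustrative and not part of any Pith tooling. Whether the digests match depends on the record here being identical to what the server returns, which this sketch does not confirm.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Canonical record copied from this page (whitespace is irrelevant after re-serialization).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "a25d475a8c8736d531719bc2b874818b43bdeba61a4bc96432f36c6330245e50",
    "cross_cats_sorted": ["cs.AI", "cs.LG"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-22T11:07:46Z",
    "title_canon_sha256": "2e57dfb93a3f43f4336b19bb1018e135f22eb1596f1a075b039054fad6ab05ba"
  },
  "schema_version": "1.0",
  "source": {"id": "2604.20443", "kind": "arxiv", "version": 2}
}
"""

# Canonical hash as published on this page.
EXPECTED = "ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628"

if __name__ == "__main__":
    digest = canonical_sha256(json.loads(RECORD_JSON))
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because the record is parsed and re-serialized before hashing, indentation and key order in the pasted JSON do not affect the digest. Only the keys and values do.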