Pith Number

pith:XJKCQFUN

pith:2026:XJKCQFUNOB5CIO5DJEXB6LWASP
not attested · not anchored · not stored · refs pending

DialToM: A Theory of Mind Benchmark for Forecasting State-Driven Dialogue Trajectories

Ee-Peng Lim, Jing Jiang, Neemesh Yadav, Palakorn Achananuparp

LLMs identify mental states in dialogue but mostly fail to forecast how conversations will unfold from those states.

arxiv:2604.20443 v2 · 2026-04-22 · cs.CL · cs.AI · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{XJKCQFUNOB5CIO5DJEXB6LWASP}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

While LLMs excel at identifying mental states, most (except for Gemini 3 Pro) fail to leverage this understanding to forecast social trajectories. Additionally, we find only weak semantic similarities between human and LLM-generated inferences.

C2 weakest assumption

The multiple-choice forecasting task and human verification process truly isolate functional use of mental states rather than surface patterns or dataset artifacts.

C3 one-line summary

LLMs identify mental states in dialogues well but, with the exception of Gemini 3 Pro, mostly fail to forecast state-consistent future trajectories; their inferences show only weak semantic similarity to human inferences.

Receipt and verification
First computed 2026-05-29T01:05:10.263651Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05)
Schema pith-number/v1.0

Canonical hash

ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628

Aliases

arxiv: 2604.20443 · arxiv_version: 2604.20443v2 · doi: 10.48550/arxiv.2604.20443 · pith_short_12: XJKCQFUNOB5C · pith_short_16: XJKCQFUNOB5CIO5D · pith_short_8: XJKCQFUN
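
The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the pith_short_8, pith_short_12, and pith_short_16 aliases are prefixes of it. This is an observation from the values shown here, not a documented rule of the pith-number/v1.0 schema. A minimal Python sketch of that relationship follows; the function name pith_number_from_hash is hypothetical.

```python
import base64

# Canonical hash copied from this page.
CANONICAL_HASH = "ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page was derived;
    the schema itself is not quoted here.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = pith_number_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)                     # XJKCQFUNOB5CIO5DJEXB6LWASP
print(aliases["pith_short_8"])  # XJKCQFUN
```

If the relationship holds in general, a short alias can be checked against a full Pith Number with a plain prefix comparison, without contacting the server.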
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/XJKCQFUNOB5CIO5DJEXB6LWASP \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a25d475a8c8736d531719bc2b874818b43bdeba61a4bc96432f36c6330245e50",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-22T11:07:46Z",
    "title_canon_sha256": "2e57dfb93a3f43f4336b19bb1018e135f22eb1596f1a075b039054fad6ab05ba"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.20443",
    "kind": "arxiv",
    "version": 2
  }
}
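
The shell one-liner above can also be run offline against the canonical record printed on this page. The sketch below uses the same canonicalization as that one-liner (sorted keys, compact separators, UTF-8 without ASCII escaping) and compares the result with the published hash. It assumes the JSON shown here is identical to the canonical_record field returned by the API; the helper name canonical_sha256 is hypothetical.

```python
import hashlib
import json

# Canonical record copied from this page (assumed equal to the API's
# `canonical_record` field).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "a25d475a8c8736d531719bc2b874818b43bdeba61a4bc96432f36c6330245e50",
    "cross_cats_sorted": ["cs.AI", "cs.LG"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-22T11:07:46Z",
    "title_canon_sha256": "2e57dfb93a3f43f4336b19bb1018e135f22eb1596f1a075b039054fad6ab05ba"
  },
  "schema_version": "1.0",
  "source": {"id": "2604.20443", "kind": "arxiv", "version": 2}
}
"""

# Hash published on this page.
EXPECTED = "ba5428168d707a243ba3492e1f2ec093d67ed54a8bdd7030464ea35c0e606628"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the whitespace and key order of the stored JSON do not affect the digest; only the values and key names do.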