Pith Number

pith:N5GYWTFU

pith:2026:N5GYWTFUQPBE7UY42KJXRLIUGR
not attested · not anchored · not stored · refs resolved

SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents

Alberto Castelo, Han Li, Lingyun Wang, Shuang Xie, Ted Chaiwachirasak, Zahra Zanjani Foumani

SimPersona learns discrete buyer types from clickstreams to let LLM agents simulate diverse real buyer populations in e-commerce.

arxiv:2605.14205 v1 · 2026-05-14 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{N5GYWTFUQPBE7UY42KJXRLIUGR}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Evaluated on 8.37M buyers across 42 held-out live storefronts, SimPersona achieves 78% conversion-rate alignment with real buyers, exhibits interpretable behavioral variation across buyer types, and outperforms a baseline with 8× more parameters on goal-oriented shopping tasks.

C2 · weakest assumption

The discrete buyer types learned from historical clickstreams will transfer effectively to LLM agent behavior in new live interactions without major distribution shift or loss of fidelity.

C3 · one-line summary

SimPersona uses a VQ-VAE to induce discrete buyer types from clickstreams, maps each type to an LLM persona token, and fine-tunes agents on those tokens, reaching 78% conversion-rate alignment with real buyers across 42 storefronts.
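
The quantization step at the heart of a VQ-VAE can be illustrated with a minimal sketch: an encoded session vector is assigned to its nearest codebook entry, and the resulting index serves as the discrete buyer type. This is not the paper's implementation. The codebook values, the two-dimensional embedding, and the `<persona_k>` token format are all hypothetical, and the encoder, decoder, and training losses are omitted.

```python
from typing import List, Sequence


def nearest_code(embedding: Sequence[float], codebook: List[Sequence[float]]) -> int:
    """Return the index of the codebook vector closest to `embedding`
    under squared Euclidean distance (the standard VQ assignment rule)."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(embedding, codebook[k]))


def persona_token(code_index: int) -> str:
    """Map a discrete code to a persona token string.
    The token format is an assumption made for illustration only."""
    return f"<persona_{code_index}>"


# Toy codebook with three entries (made-up values, not learned from data).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Hypothetical encoder output for one buyer's clickstream.
session_embedding = [0.9, 1.2]

k = nearest_code(session_embedding, codebook)
print(k, persona_token(k))  # 1 <persona_1>
```

In this toy setup the token would be prepended to the agent's prompt so that a fine-tuned model can condition on the buyer type; how SimPersona actually wires the token into the agent is described in the paper, not here.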

Receipt and verification
First computed: 2026-05-17T23:39:11.002190Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

6f4d8b4cb483c24fd31cd29378ad14346ad1ebea3b3304867e188455b4c058fb

Aliases

arxiv: 2605.14205 · arxiv_version: 2605.14205v1 · doi: 10.48550/arxiv.2605.14205 · pith_short_12: N5GYWTFUQPBE · pith_short_16: N5GYWTFUQPBE7UY4 · pith_short_8: N5GYWTFU
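
For this record, the identifier and its short aliases line up with the canonical hash: the 26-character Pith Number equals the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest, and each `pith_short_N` alias is its first N characters. The sketch below checks that relationship. It is an observation about this one record, not a documented rule of the `pith-number/v1.0` schema, and the helper name `base32_prefix` is hypothetical.

```python
import base64

# Canonical hash as listed on this record.
CANONICAL_HASH = "6f4d8b4cb483c24fd31cd29378ad14346ad1ebea3b3304867e188455b4c058fb"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.
    Assumption: the identifier is derived this way; the schema is not quoted here."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full = base32_prefix(CANONICAL_HASH)
print(full)  # N5GYWTFUQPBE7UY42KJXRLIUGR

# The short aliases are plain prefixes of the full identifier.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {full[:n]}")
```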
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/N5GYWTFUQPBE7UY42KJXRLIUGR \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 6f4d8b4cb483c24fd31cd29378ad14346ad1ebea3b3304867e188455b4c058fb
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "6c48b9c860951119510e3289ae4339f983bda13e19d0d5cf16be4dad608717bc",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-14T00:01:11Z",
    "title_canon_sha256": "3d0850380e19e90bf30715c20989772fef818895df33e1f9a1c7527cd3d6f2d0"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14205",
    "kind": "arxiv",
    "version": 1
  }
}
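
The verification one-liner can be unpacked into a small Python function that applies the same canonicalization: sorted keys, compact separators, no ASCII escaping, UTF-8 bytes, then SHA-256. The sketch below runs it on the record as displayed on this page. Whether the digest equals the listed canonical hash depends on the displayed JSON being identical to what the API returns under `canonical_record`, so the script prints the comparison instead of assuming the result. The function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize `record` deterministically and return its SHA-256 hex digest.
    Mirrors the json.dumps arguments used in the shell pipeline."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "6c48b9c860951119510e3289ae4339f983bda13e19d0d5cf16be4dad608717bc",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-14T00:01:11Z",
        "title_canon_sha256": "3d0850380e19e90bf30715c20989772fef818895df33e1f9a1c7527cd3d6f2d0",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14205", "kind": "arxiv", "version": 1},
}

EXPECTED = "6f4d8b4cb483c24fd31cd29378ad14346ad1ebea3b3304867e188455b4c058fb"

digest = canonical_sha256(record)
print(digest)
print("matches listed canonical hash:", digest == EXPECTED)
```

Sorting keys and fixing the separators is what makes the hash reproducible: two mirrors that hold the same record with different key order or whitespace still compute the same digest.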