pith:H2M3KJ73
Lost in Decoding? Reproducing and Stress-Testing the Look-Ahead Prior in Generative Retrieval
PAG's planning signal in generative retrieval collapses under intent-preserving typos.
arxiv:2604.23396 v1 · 2026-04-25 · cs.IR · cs.AI · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{H2M3KJ733MRYFZ3A6W24Y44MXL}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
PAG's planning signal is brittle under lexical surface-form variation: intent-preserving typos can trigger plan collapse, where the planned candidate pool shifts enough that the look-ahead bonus provides little useful guidance, effectively reverting decoding toward weaker unguided search.
The analysis assumes that the plan drift diagnostics and cross-lingual tests isolate the stability of the planning signal, without being confounded by specific beam sizes, trie construction details, or unstated differences between the released checkpoint and the original training run.
Reproduction confirms that PAG boosts generative retrieval effectiveness, but its look-ahead planning signal collapses under intent-preserving typos and query mismatches, so performance reverts to that of unguided decoding.
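The claims above describe plan collapse as a shift in the planned candidate pool between a clean query and its typo-perturbed variant. As a rough illustration only, such a shift could be quantified with set overlap between the two pools. The function name `pool_overlap`, the document IDs, and the choice of Jaccard overlap are all assumptions made for this sketch; the record does not state which drift metric the paper actually uses.

```python
from typing import Iterable


def pool_overlap(clean_pool: Iterable[str], perturbed_pool: Iterable[str]) -> float:
    """Jaccard overlap between two candidate pools (hypothetical drift measure).

    Returns 1.0 when the pools are identical and 0.0 when they are disjoint.
    Two empty pools are treated as identical.
    """
    a, b = set(clean_pool), set(perturbed_pool)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up document IDs, purely for illustration.
clean = ["d1", "d2", "d3", "d4"]   # pool planned for the clean query
typo = ["d3", "d4", "d7", "d9"]    # pool planned for the same query with a typo

overlap = pool_overlap(clean, typo)
print(f"pool overlap: {overlap:.3f}")  # 2 shared IDs out of 6 distinct IDs
```

A low overlap would suggest that the planned pool has drifted far from the clean-query plan, which is the situation in which the look-ahead bonus is said to give little useful guidance.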
Receipt and verification
| First computed | 2026-05-26T01:03:31.176657Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
3e99b527fbdb2382e760f5b5cc738cbad9069278ec43b042fd2ce00c565c1087
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/H2M3KJ733MRYFZ3A6W24Y44MXL \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 3e99b527fbdb2382e760f5b5cc738cbad9069278ec43b042fd2ce00c565c1087
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "ca009d26a660ddfdc83d2ad226545f6b02427f8d17ebd14270feb38aeadff4c6",
"cross_cats_sorted": [
"cs.AI",
"cs.CL",
"cs.LG"
],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.IR",
"submitted_at": "2026-04-25T17:58:15Z",
"title_canon_sha256": "b2d8b74091a1eda5fbcbf95ff63dbb39e8bdbcfe97af4499246ca91f5954cbbe"
},
"schema_version": "1.0",
"source": {
"id": "2604.23396",
"kind": "arxiv",
"version": 1
}
}
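The verification one-liner above serializes the canonical record with sorted keys and compact separators before hashing it with SHA-256. A minimal offline sketch of the same recipe, applied to the record shown on this page, might look like the following. The helper name `canonical_sha256` is an assumption introduced here, and whether the digest matches the published canonical hash depends on the served record being byte-for-byte what is displayed.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the serialization from the page's one-liner:
    sorted keys, compact separators, non-ASCII kept as-is, UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "ca009d26a660ddfdc83d2ad226545f6b02427f8d17ebd14270feb38aeadff4c6",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.IR",
        "submitted_at": "2026-04-25T17:58:15Z",
        "title_canon_sha256": "b2d8b74091a1eda5fbcbf95ff63dbb39e8bdbcfe97af4499246ca91f5954cbbe",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.23396", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye against the canonical hash listed on this page
```

Because keys are sorted before hashing, the order in which the record's fields are written does not affect the digest; only the keys and values do.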