Pith Number

pith:OCGHI46Q

pith:2026:OCGHI46QRMJI4QB2IZPG3JR3BR
not attested · not anchored · not stored · refs resolved

Towards Self-Evolving Agentic Literature Retrieval

Fenyi Liu, Jing Kang, Jingyi Chai, Siheng Chen, Sikai Yao, Tian Jin, Tingjia Miao, Wenhao Wang, Xianghe Pang, Yuwen Du, Yuzhi Zhang

PaSaMaster turns literature retrieval into a self-evolving process that ranks papers by relevance without generating sources, outperforming GPT-5.2 by 30% at 1% cost with zero hallucinations.

arxiv:2605.14306 v1 · 2026-05-14 · cs.IR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{OCGHI46QRMJI4QB2IZPG3JR3BR}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
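The page does not specify the merge algorithm, so the following is only a minimal sketch of how a merge can be made deterministic. It assumes a hypothetical event shape (`timestamp`, `field`, `value`) and a hypothetical last-writer-wins rule; signature verification is omitted. The idea is that events are de-duplicated by content hash and applied in a total order that does not depend on the order in which a mirror received them.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content hash of an event; used as a tie-breaker and for de-duplication."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge(canonical: dict, events: list) -> dict:
    """Fold events into a state in an order independent of arrival order."""
    state = dict(canonical)  # copy so the canonical record is never mutated
    seen = set()
    # ISO 8601 UTC timestamps in one fixed format sort correctly as strings.
    for ev in sorted(events, key=lambda e: (e["timestamp"], event_id(e))):
        eid = event_id(ev)
        if eid in seen:  # identical events from several mirrors count once
            continue
        seen.add(eid)
        state[ev["field"]] = ev["value"]  # hypothetical last-writer-wins rule
    return state


# Hypothetical record and events, for illustration only.
record = {"id": "2605.14306", "author_claim": "open"}
events = [
    {"timestamp": "2026-05-18T00:00:00Z", "field": "citations", "value": "open"},
    {"timestamp": "2026-05-19T00:00:00Z", "field": "author_claim", "value": "claimed"},
]

# Reordering or duplicating the events does not change the merged state.
assert merge(record, events) == merge(record, list(reversed(events)) + events)
```

Sorting on the pair (timestamp, content hash) gives every mirror the same total order even when two events share a timestamp, which is what makes the result reproducible.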

Claims

C1 · strongest claim

PaSaMaster outperforms GPT-5.2 by 30.0% at a mere 1% of the computational cost while ensuring zero source hallucination, and improves F1-score by 15.6X over traditional keyword retrieval on the PaSaMaster Benchmark across 38 disciplines.

C2 · weakest assumption

That the PaSaMaster Benchmark faithfully represents real-world scientific search intents, and that the iterative evidence-ranking process reliably improves results without introducing selection bias or new failure modes.

C3 · one-line summary

PaSaMaster is a self-evolving agentic literature retrieval system that improves F1-score by 15.6X over keyword search and outperforms GPT-5.2 by 30% at 1% cost with zero source hallucination across 38 disciplines.

References

31 extracted · 31 resolved · 1 Pith anchor

[1] PaSa: An LLM Agent for Comprehensive Academic Paper Search. 2025.
[2] Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. 2021.
[3] Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. 2025.
[4] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2019.
[5] GPT-4 Technical Report. arXiv:2303.08774.

Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-17T23:39:10.046137Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

708c7473d08b128e403a465e6da63b0c7668990dfef4c843eb1a6783272f8a0d

Aliases

arxiv: 2605.14306 · arxiv_version: 2605.14306v1 · doi: 10.48550/arxiv.2605.14306 · pith_short_12: OCGHI46QRMJI · pith_short_16: OCGHI46QRMJI4QB2 · pith_short_8: OCGHI46Q
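The page does not say how the Pith Number is derived. For this record, though, the identifier matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and the `pith_short_*` aliases are prefixes of it. The sketch below checks that relationship. It is an observation about this one record rather than a documented rule, and `pith_number` is a hypothetical helper name.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "708c7473d08b128e403a465e6da63b0c7668990dfef4c843eb1a6783272f8a0d"


def pith_number(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number(CANONICAL_HASH)
print(full)       # OCGHI46QRMJI4QB2IZPG3JR3BR
print(full[:16])  # OCGHI46QRMJI4QB2  (pith_short_16)
print(full[:12])  # OCGHI46QRMJI      (pith_short_12)
print(full[:8])   # OCGHI46Q          (pith_short_8)
```

If the relationship holds in general, any alias can be re-derived offline from the canonical hash alone. Whether other records follow the same scheme is an assumption.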
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/OCGHI46QRMJI4QB2IZPG3JR3BR \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 708c7473d08b128e403a465e6da63b0c7668990dfef4c843eb1a6783272f8a0d
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a5596e4f9985c30701db6317fea7bcfc115e10308268bffdd3de6331fb3cb0cd",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2026-05-14T03:17:31Z",
    "title_canon_sha256": "f970c03e19dc3186ddcb644eae14c221773ceeae9488efa97c3161de53f56178"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14306",
    "kind": "arxiv",
    "version": 1
  }
}
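The record above can also be hashed offline, without curl or jq. This is a minimal sketch that applies the same canonicalization as the shell pipeline in the verification step (sorted keys, compact separators, `ensure_ascii=False`, UTF-8). The function name `canonical_sha256` is an assumption made for illustration, and the result should be compared against the value listed under Canonical hash.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 over the compact, key-sorted JSON serialization of `record`."""
    blob = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a5596e4f9985c30701db6317fea7bcfc115e10308268bffdd3de6331fb3cb0cd",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.IR",
        "submitted_at": "2026-05-14T03:17:31Z",
        "title_canon_sha256": "f970c03e19dc3186ddcb644eae14c221773ceeae9488efa97c3161de53f56178",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14306", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare with the value listed under "Canonical hash"
```

Because keys are sorted before hashing, the digest does not depend on the order in which a JSON parser happens to return the fields.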