Pith Number

pith:LPRDPQ2T

pith:2026:LPRDPQ2T2SVLAJDA2YZJOZRF4M

not attested · not anchored · not stored · refs resolved

From Node2Vec to GPT-based GraphRAG: scientific impact prediction across graph and language models

Adilson Vital Jr., Diego R. Amancio, Filipi N. Silva

Directed citation graphs combined with textual embeddings predict scientific impact with 0.84-0.85 AUC, while GPT prompts without retrieval often match GraphRAG performance at 0.87.

arxiv:2605.18410 v1 · 2026-05-18 · cs.DL


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{LPRDPQ2T2SVLAJDA2YZJOZRF4M}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The best supervised configuration combines directed citation graphs with textual embeddings, reaching approximately 0.84-0.85 AUC. [...] target-only prompts often perform as well as or better than GraphRAG prompts achieving the 0.87 mark.

C2 · weakest assumption

That cohort-normalized top-P% citation rank measured years later is a stable and unbiased proxy for scientific impact that can be meaningfully predicted from information available at publication time under temporal graph constraints.
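To make the assumption concrete, here is a minimal sketch of a cohort-normalized top-P% label. It is not the authors' implementation. It assumes that a cohort is a publication year, that ties are broken by sort order, and that every cohort keeps at least one top paper. The function name `top_p_labels` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def top_p_labels(papers, p=10.0):
    """Label each paper True if it is in the top p% of its year cohort.

    papers: iterable of (paper_id, year, citations) tuples.
    Assumptions (not from the paper): cohort = publication year,
    ties broken by sort order, at least one top paper per cohort.
    """
    cohorts = defaultdict(list)
    for paper_id, year, citations in papers:
        cohorts[year].append((paper_id, citations))

    labels = {}
    for members in cohorts.values():
        # Rank papers within the cohort by citation count, highest first.
        ranked = sorted(members, key=lambda m: m[1], reverse=True)
        cutoff = max(1, math.ceil(len(ranked) * p / 100.0))
        for rank, (paper_id, _) in enumerate(ranked):
            labels[paper_id] = rank < cutoff
    return labels


# Hypothetical toy data: two cohorts with different citation scales.
toy = [
    ("a", 2009, 50), ("b", 2009, 5), ("c", 2009, 1), ("d", 2009, 0),
    ("e", 2010, 3), ("f", 2010, 2),
]
print(top_p_labels(toy, p=25.0))
```

Because the ranking is done inside each cohort, paper "e" with 3 citations is labeled a top paper in its year even though "b" with 5 citations is not. This is the normalization that the assumption relies on.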

C3 · one line summary

Directed citation graphs plus textual embeddings reach 0.84-0.85 AUC for top-P% impact classification while GPT-5.5/5.4 Nano prompts hit 0.87 but show no consistent gain from retrieved graph neighborhoods over target-only baselines.

References

57 extracted · 57 resolved · 3 Pith anchors


Receipt and verification

First computed	2026-05-20T00:05:59.384559Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

5be237c353d4aab02460d632976625e3285982c2d5453351574a1ff00bdd9bf3

Aliases

arxiv: 2605.18410 · arxiv_version: 2605.18410v1 · doi: 10.48550/arxiv.2605.18410 · pith_short_12: LPRDPQ2T2SVL · pith_short_16: LPRDPQ2T2SVLAJDA · pith_short_8: LPRDPQ2T
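The short aliases above are plain prefixes of the full Pith Number. In this record the full number also matches the unpadded base32 encoding of the first 16 bytes of the canonical hash. The sketch below checks both relationships. The base32 derivation is inferred from this single record rather than taken from the Pith schema, so treat it as an assumption. The helper names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "5be237c353d4aab02460d632976625e3285982c2d5453351574a1ff00bdd9bf3"
PITH_NUMBER = "LPRDPQ2T2SVLAJDA2YZJOZRF4M"


def pith_from_hash(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest, dropping '=' padding.

    Assumption: this derivation is inferred from one record,
    not from the published Pith schema.
    """
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


def short_alias(pith_number: str, length: int) -> str:
    """Return the pith_short_<length> alias as a prefix of the full number."""
    return pith_number[:length]


print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
for n in (8, 12, 16):
    print(f"pith_short_{n}: {short_alias(PITH_NUMBER, n)}")
```

If the relationship holds in general, a client could rebuild any alias from the canonical hash alone. The resolver remains the authoritative source.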

Agent API

Resolver JSON Graph JSON Events JSON Schema Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/LPRDPQ2T2SVLAJDA2YZJOZRF4M \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5be237c353d4aab02460d632976625e3285982c2d5453351574a1ff00bdd9bf3

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "b59dc24f3b7d5fa91faa1d83431ab5b3442f61fd6d8bb509f7d43ac679039990",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DL",
    "submitted_at": "2026-05-18T13:48:58Z",
    "title_canon_sha256": "f37eaa24c0d138982c7c9e9ca38bad50bef5aec81aa9c6f80a24dc404292649b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.18410",
    "kind": "arxiv",
    "version": 1
  }
}
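The record above can also be hashed offline, without calling the resolver. The sketch below uses the same serialization as the one-liner in the verification section: sorted keys, compact separators, and UTF-8 output. It assumes the JSON printed on this page is identical to what the resolver returns in `canonical_record`. The function names are hypothetical.

```python
import hashlib
import json

# Canonical record as printed on this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "b59dc24f3b7d5fa91faa1d83431ab5b3442f61fd6d8bb509f7d43ac679039990",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.DL",
    "submitted_at": "2026-05-18T13:48:58Z",
    "title_canon_sha256": "f37eaa24c0d138982c7c9e9ca38bad50bef5aec81aa9c6f80a24dc404292649b"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.18410", "kind": "arxiv", "version": 1}
}
"""

EXPECTED = "5be237c353d4aab02460d632976625e3285982c2d5453351574a1ff00bdd9bf3"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, then encode as UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_sha256(json.loads(RECORD_JSON))
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Sorting the keys makes the digest independent of the order in which a mirror happens to store the fields, so two hosts can compare hashes directly.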