Pith Number

pith:EJO76EXF

pith:2026:EJO76EXF5I5JF2OJJ5DQI6LG7F
not attested · not anchored · not stored · refs resolved

PolitNuggets: Benchmarking Agentic Discovery of Long-Tail Political Facts

Yifei Zhu

Current AI agents struggle with fine-grained long-tail political facts and show wide variation in discovery efficiency.

arxiv:2605.14002 v1 · 2026-05-13 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EJO76EXF5I5JF2OJJ5DQI6LG7F}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Across models and settings, we find that current systems often struggle with fine-grained details, and vary substantially in efficiency. Finally, using benchmark diagnostics, we relate agent performance to underlying model capabilities, highlighting the importance of short-context extraction, multilingual robustness, and reliable tool use.

C2 weakest assumption

That the more than 10,000 political facts assembled for the 400 biographies are accurate long-tail ground truth, correctly drawn from dispersed sources, and that the FactNet protocol validly measures real-world agentic discovery capability.

C3 one line summary

PolitNuggets is a multilingual benchmark showing that AI agents struggle with fine-grained accuracy and efficiency when discovering long-tail political facts for elite biographies, linking performance to short-context extraction, multilingual robustness, and tool use.

References

22 extracted · 22 resolved · 1 Pith anchors


Formal links

3 machine-checked theorem links

Receipt and verification
First computed 2026-05-17T23:39:13.147969Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05)
Schema pith-number/v1.0

Canonical hash

225dff12e5ea3a92e9c94f47047966f970563b0a6cbee3883c06ec28c2d24534

Aliases

arxiv: 2605.14002 · arxiv_version: 2605.14002v1 · doi: 10.48550/arxiv.2605.14002 · pith_short_12: EJO76EXF5I5J · pith_short_16: EJO76EXF5I5JF2OJ · pith_short_8: EJO76EXF
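The identifiers listed here appear to follow a simple pattern. Each pith_short_N alias is the first N characters of the full 26-character Pith Number, and the Pith Number itself matches the first 26 characters of the standard base32 encoding of the canonical hash. The page does not state this derivation, so the sketch below is an observation checked against this one record, not a documented rule. The function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "225dff12e5ea3a92e9c94f47047966f970563b0a6cbee3883c06ec28c2d24534"
PITH_NUMBER = "EJO76EXF5I5JF2OJJ5DQI6LG7F"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: base32-encode the digest bytes, keep a prefix."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_alias(pith_number: str, n: int) -> str:
    """Assumed rule for pith_short_N: the first n characters."""
    return pith_number[:n]


if __name__ == "__main__":
    print(pith_number_from_hash(CANONICAL_HASH))
    for n in (8, 12, 16):
        print(f"pith_short_{n}: {short_alias(PITH_NUMBER, n)}")
```

If the pattern holds for other records, a short alias can be expanded only by lookup, since a prefix does not determine the remaining characters.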
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EJO76EXF5I5JF2OJJ5DQI6LG7F \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 225dff12e5ea3a92e9c94f47047966f970563b0a6cbee3883c06ec28c2d24534
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "00ddd38154ac3174e2f82d6c785709ee23a7b818d352f15b0ed9f59f9d60c7c7",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-05-13T18:09:03Z",
    "title_canon_sha256": "9765c817aa3899abf0d69aa0966fcb8761149bdea13f8535dcb10bacd89a6b26"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14002",
    "kind": "arxiv",
    "version": 1
  }
}
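
The shell check under "Verify this Pith Number yourself" can also be run offline against the canonical record JSON shown above. The sketch below mirrors the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8) in Python. It assumes the JSON printed on this page is the same object the API returns under canonical_record; the names canonical_bytes and canonical_hash are hypothetical.

```python
import hashlib
import json

# Hash shown under "Canonical hash" on this page.
EXPECTED_HASH = "225dff12e5ea3a92e9c94f47047966f970563b0a6cbee3883c06ec28c2d24534"

# Canonical record transcribed from this page (assumed to equal the API's
# `canonical_record` field).
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "00ddd38154ac3174e2f82d6c785709ee23a7b818d352f15b0ed9f59f9d60c7c7",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-05-13T18:09:03Z",
        "title_canon_sha256": "9765c817aa3899abf0d69aa0966fcb8761149bdea13f8535dcb10bacd89a6b26",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14002", "kind": "arxiv", "version": 1},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize the way the page's shell pipeline does before hashing."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(CANONICAL_RECORD)
    print(digest)
    print("matches expected:", digest == EXPECTED_HASH)
```

Sorting keys and fixing the separators makes the serialization independent of key order and whitespace in the input, which is what lets a mirror recompute the same digest from the same record.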