pith:SSEYKZZH
Reconstructing temporal multi-relational firm networks at scale using large language models. The case of the semiconductor industry
Large language models can extract supply-chain, partnership and ownership links from public webpages to build a temporal network of over 1,300 semiconductor firms.
arxiv:2605.15842 v1 · 2026-05-15 · physics.soc-ph · cs.SI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{SSEYKZZHIPFEHOCVMHW6EBTPAN}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
a novel, generalizable methodology combining Large Language Models (LLMs) with open web data can reconstruct this network and its structural dynamics at scale... yielding a temporal network of over 1,300 linked firms. We validate link-extraction quality (Precision: 0.884; F1-score: 0.784), network overlap and complementarity with a proprietary database, and consistency with aggregate economic data.
The assumption that publicly available firm webpages contain sufficiently complete and unbiased information on supply-chain, partnership, and ownership relations, and that LLM extraction can reliably classify these links without systematic errors that would distort network structure or temporal dynamics.
LLM-based extraction from open web data reconstructs a validated temporal multi-relational network of semiconductor firms and reveals dynamics such as a 9% edge decline during the 2022 chip shortage.
Receipt and verification
| First computed | 2026-05-20T00:01:21.309892Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
948985672743ca43b85561ede2066f0362243d24dd5ac48ef3c55841d6e02866
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SSEYKZZHIPFEHOCVMHW6EBTPAN \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 948985672743ca43b85561ede2066f0362243d24dd5ac48ef3c55841d6e02866
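The same check can be unpacked into a short Python sketch, which may be easier to read than the one-liner. It assumes, as the command above does, that the hash is the SHA-256 of the canonical record serialized with sorted keys, compact separators, and UTF-8 encoding. The function name `canonical_hash` is illustrative and not part of any Pith tooling; the record literal is copied from the canonical record JSON on this page rather than fetched over the network.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of a record in compact, key-sorted JSON.

    Mirrors the serialization used by the shell one-liner: sort_keys=True,
    separators=(',', ':'), ensure_ascii=False, then UTF-8 encoding.
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


# Canonical record as listed on this page (copied by hand, not fetched).
record = {
    "metadata": {
        "abstract_canon_sha256": "39e0b9d6e05d9cda15512d5d2c1b2bb97bb77a3e3f9d1e28f8f2183e2d146d83",
        "cross_cats_sorted": ["cs.SI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "physics.soc-ph",
        "submitted_at": "2026-05-15T10:55:03Z",
        "title_canon_sha256": "1d0180c73a182a259f2bb329e5cc4f2b7d0baed91be1b8e672f6ba06d67eabd7",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15842", "kind": "arxiv", "version": 1},
}

# Hash published on this page; the comparison is reported, not assumed.
EXPECTED = "948985672743ca43b85561ede2066f0362243d24dd5ac48ef3c55841d6e02866"

digest = canonical_hash(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the order in which fields appear in the fetched JSON does not affect the digest; any change to a value, however small, does.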
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "39e0b9d6e05d9cda15512d5d2c1b2bb97bb77a3e3f9d1e28f8f2183e2d146d83",
    "cross_cats_sorted": [
      "cs.SI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "physics.soc-ph",
    "submitted_at": "2026-05-15T10:55:03Z",
    "title_canon_sha256": "1d0180c73a182a259f2bb329e5cc4f2b7d0baed91be1b8e672f6ba06d67eabd7"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15842",
    "kind": "arxiv",
    "version": 1
  }
}