Pith Number

pith:PUSHK26R

pith:2023:PUSHK26RYUT65IRF2HJ47GK7W2

not attested · not anchored · not stored · refs resolved

BloombergGPT: A Large Language Model for Finance

David Rosenberg, Gideon Mann, Mark Dredze, Ozan Irsoy, Prabhanjan Kambadur, Sebastian Gehrmann, Shijie Wu, Steven Lu, Vadim Dabravolski

BloombergGPT, a 50-billion-parameter model trained on a mix of financial and general-purpose data, outperforms prior models on financial tasks while preserving general LLM performance.

arxiv:2303.17564 v3 · 2023-03-30 · cs.LG · cs.AI · cs.CL · q-fin.GN


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{PUSHK26RYUT65IRF2HJ47GK7W2}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks.

C2 weakest assumption

That the internal benchmarks and chosen financial data sources accurately reflect real-world usage and that the performance gains are not due to dataset-specific artifacts or evaluation choices.

C3 one-line summary

BloombergGPT is a 50B-parameter LLM trained on a 708B-token mix of financial and general data; it outperforms prior models on financial benchmarks while preserving general LLM performance.

References

140 extracted · 140 resolved · 32 Pith anchors

[1] FinBERT: Financial Sentiment Analysis with Pre-trained Language Models 2019 · arXiv:1908.10063

[2] PLATO-XL: Exploring the large-scale pre-training of dialogue generation 2022

[3] SciBERT: A pretrained language model for scientific text 2019 · doi:10.18653/v1/d19-1371

[4] On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 610-623. 2021

[5] The fifth PASCAL recognizing textual entailment challenge 2009

Formal links

2 machine-checked theorem links

Cited by

54 papers in Pith

Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks

ConfusionPrompt: Practical Private Inference for Online Large Language Models

Bridging Language Models and Financial Analysis

MulFSA: Multi-level Financial Sentiment Analysis Framework for Bond Market

Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning

Receipt and verification

First computed	2026-05-18T03:47:53.393747Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

7d24756bd1c527eea225d1d3cf995fb6a5eaea0be5362d70ab1712618bfb7c58

Aliases

arxiv: 2303.17564 · arxiv_version: 2303.17564v3 · doi: 10.48550/arxiv.2303.17564 · pith_short_12: PUSHK26RYUT6 · pith_short_16: PUSHK26RYUT65IRF · pith_short_8: PUSHK26R
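The short aliases above appear to be plain prefixes of the full 26-character identifier, and the identifier itself matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash listed on this page. That relationship is inferred from the values shown here, not from a published Pith specification, so the following Python sketch should be read as an assumption; `pith_id_from_hash` and `short_aliases` are hypothetical helper names.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "7d24756bd1c527eea225d1d3cf995fb6a5eaea0be5362d70ab1712618bfb7c58"


def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier on this page relates to its
    canonical hash; it is not taken from a Pith specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_id: str) -> dict:
    """Return the 8-, 12-, and 16-character prefixes listed under Aliases."""
    return {f"pith_short_{n}": pith_id[:n] for n in (8, 12, 16)}


pith_id = pith_id_from_hash(CANONICAL_HASH)
print(pith_id)
print(short_aliases(pith_id))
```

If the inference holds for other records, a client could check that an identifier and a canonical hash belong together without any network access.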

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

# Fetch the record, extract canonical_record, re-serialize it as compact JSON
# with sorted keys, and print the SHA-256 digest of the result.
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7d24756bd1c527eea225d1d3cf995fb6a5eaea0be5362d70ab1712618bfb7c58

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "3c618fff827861fdc6b1d501a9e1e1d2a66362df35cf96dc55522e3df3e43035",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "q-fin.GN"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-03-30T17:30:36Z",
    "title_canon_sha256": "ce000b4019446a1232badc49fdffe0f3fa25a4751e0d195e87cf1b1f46bd0aff"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2303.17564",
    "kind": "arxiv",
    "version": 3
  }
}
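The verification command earlier on this page re-serializes the canonical record with sorted keys and compact separators before hashing it. A minimal offline sketch of that same step in Python, applied to the record shown above, might look like the following. `canonical_sha256` is a hypothetical helper name, and whether its output equals the published canonical hash depends on the served record being equivalent to what is displayed here, so no expected digest is asserted.

```python
import hashlib
import json

# The canonical record as displayed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "3c618fff827861fdc6b1d501a9e1e1d2a66362df35cf96dc55522e3df3e43035",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "q-fin.GN"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2023-03-30T17:30:36Z",
        "title_canon_sha256": "ce000b4019446a1232badc49fdffe0f3fa25a4751e0d195e87cf1b1f46bd0aff",
    },
    "schema_version": "1.0",
    "source": {"id": "2303.17564", "kind": "arxiv", "version": 3},
}


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


print(canonical_sha256(RECORD))
```

Because keys are sorted before serialization, the digest does not depend on the order in which the record's keys happen to be stored or transmitted.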