{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2023:PUSHK26RYUT65IRF2HJ47GK7W2","short_pith_number":"pith:PUSHK26R","schema_version":"1.0","canonical_sha256":"7d24756bd1c527eea225d1d3cf995fb6a5eaea0be5362d70ab1712618bfb7c58","source":{"kind":"arxiv","id":"2303.17564","version":3},"attestation_state":"computed","paper":{"title":"BloombergGPT: A Large Language Model for Finance","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"BloombergGPT, a 50 billion parameter model trained on financial plus general data, outperforms prior models on financial tasks while preserving general LLM performance.","cross_cats":["cs.AI","cs.CL","q-fin.GN"],"primary_cat":"cs.LG","authors_text":"David Rosenberg, Gideon Mann, Mark Dredze, Ozan Irsoy, Prabhanjan Kambadur, Sebastian Gehrmann, Shijie Wu, Steven Lu, Vadim Dabravolski","submitted_at":"2023-03-30T17:30:36Z","abstract_excerpt":"The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific data"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2303.17564","kind":"arxiv","version":3},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2023-03-30T17:30:36Z","cross_cats_sorted":["cs.AI","cs.CL","q-fin.GN"],"title_canon_sha256":"ce000b4019446a1232badc49fdffe0f3fa25a4751e0d195e87cf1b1f46bd0aff","abstract_canon_sha256":"3c618fff827861fdc6b1d501a9e1e1d2a66362df35cf96dc55522e3df3e43035"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-18T03:47:53.394418Z","signature_b64":"xIVEqz99PRJVIkww6RxvXbQKyYUeKtNSnM5iyu+o5xqBDo9JqiA+RiKV3m4TP8grTzasVg+hc1QhFm41KF/4AQ==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"7d24756bd1c527eea225d1d3cf995fb6a5eaea0be5362d70ab1712618bfb7c58","last_reissued_at":"2026-05-18T03:47:53.393747Z","signature_status":"signed_v1","first_computed_at":"2026-05-18T03:47:53.393747Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"BloombergGPT: A Large Language Model for Finance","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"BloombergGPT, a 50 billion parameter model trained on financial plus general data, outperforms prior models on financial tasks while preserving general LLM performance.","cross_cats":["cs.AI","cs.CL","q-fin.GN"],"primary_cat":"cs.LG","authors_text":"David Rosenberg, Gideon Mann, Mark Dredze, Ozan Irsoy, Prabhanjan Kambadur, Sebastian Gehrmann, Shijie Wu, Steven Lu, Vadim Dabravolski","submitted_at":"2023-03-30T17:30:36Z","abstract_excerpt":"The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific data"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the internal benchmarks and chosen financial data sources accurately reflect real-world usage and that the performance gains are not due to dataset-specific artifacts or evaluation choices.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"BloombergGPT, a 50 billion parameter model trained on financial plus general data, outperforms prior models on financial tasks while preserving general LLM performance.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"a89833c5bc818881b7ff6f8cb2096d8697ff892e4f0cd9653527f9be64826198"},"source":{"id":"2303.17564","kind":"arxiv","version":3},"verdict":{"id":"fb5079f8-d635-4cc5-8715-ec8dd5937eec","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-13T23:14:46.985302Z","strongest_claim":"Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks.","one_line_summary":"BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the internal benchmarks and chosen financial data sources accurately reflect real-world usage and that the performance gains are not due to dataset-specific artifacts or evaluation choices.","pith_extraction_headline":"BloombergGPT, a 50 billion parameter model trained on financial plus general data, outperforms prior models on financial tasks while preserving general LLM performance."},"references":{"count":140,"sample":[{"doi":"","year":1908,"title":"FinBERT: Financial Sentiment Analysis with Pre-trained Language Models","work_id":"3dd01f6f-6c0f-4a47-a6dd-6ca15bc4e219","ref_index":1,"cited_arxiv_id":"1908.10063","is_internal_anchor":true},{"doi":"","year":2022,"title":"PLATO - XL : Exploring the large-scale pre-training of dialogue generation","work_id":"c1c6d296-00b7-46ca-b0d9-8274a884fa42","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.18653/v1/d19-1371","year":2019,"title":"S ci BERT : A pretrained language model for scientific text","work_id":"f4322284-a0f8-4f17-b855-c0eccacb4545","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2021,"title":"On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, pages 610--623","work_id":"717f4c67-3986-48a0-b9f6-50e30b658cbf","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2009,"title":"The fifth PASCAL recognizing textual entailment challenge","work_id":"b6e103a5-a597-4905-b033-4a41b167e7ac","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":140,"snapshot_sha256":"0af8286cfd92006a17cd11e70fb5a030bf5babcb880c2ba0c7e9ded791457812","internal_anchors":32},"formal_canon":{"evidence_count":2,"snapshot_sha256":"22a43aa6f5f72f4089a51719a6fac9a632a9ea4598046a111477a4a1d49cda8f"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2303.17564","created_at":"2026-05-18T03:47:53.393859+00:00"},{"alias_kind":"arxiv_version","alias_value":"2303.17564v3","created_at":"2026-05-18T03:47:53.393859+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2303.17564","created_at":"2026-05-18T03:47:53.393859+00:00"},{"alias_kind":"pith_short_12","alias_value":"PUSHK26RYUT6","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"PUSHK26RYUT65IRF","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"PUSHK26R","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":54,"internal_anchor_count":54,"sample":[{"citing_arxiv_id":"2605.23243","citing_title":"Are Frontier LLMs Ready for Cybersecurity? Evidence for Vertical Foundation Models from Dual-Mode Vulnerability Benchmarks","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2401.00870","citing_title":"ConfusionPrompt: Practical Private Inference for Online Large Language Models","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2503.22693","citing_title":"Bridging Language Models and Financial Analysis","ref_index":106,"is_internal_anchor":true},{"citing_arxiv_id":"2504.02429","citing_title":"MulFSA: Multi-level Financial Sentiment Analysis Framework for Bond Market","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2504.09114","citing_title":"Deploying Large AI Models on Resource-Limited Devices with Split Federated Learning","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2605.21975","citing_title":"Reasoning through Verifiable Forecast Actions: Consistency-Grounded RL for Financial LLMs","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15104","citing_title":"From Text to Voice: A Reproducible and Verifiable Framework for Evaluating Tool Calling LLM Agents","ref_index":40,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15156","citing_title":"MeMo: Memory as a Model","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16776","citing_title":"Distinguishable Deletion: Unifying Knowledge Erasure and Refusal for Large Language Model Unlearning","ref_index":59,"is_internal_anchor":true},{"citing_arxiv_id":"2307.06435","citing_title":"A Comprehensive Overview of Large Language Models","ref_index":151,"is_internal_anchor":true},{"citing_arxiv_id":"2604.05966","citing_title":"FinReporting: An Agentic Workflow for Localized Reporting of Cross-Jurisdiction Financial Disclosures","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15412","citing_title":"From Feedback Loops to Policy Updates: Reinforcement Fine-Tuning for LLM-Based Alpha Factor Discovery","ref_index":61,"is_internal_anchor":true},{"citing_arxiv_id":"2506.11512","citing_title":"From Time Series Analysis to Question Answering: A Survey in the LLM Era","ref_index":108,"is_internal_anchor":true},{"citing_arxiv_id":"2506.13538","citing_title":"Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers","ref_index":148,"is_internal_anchor":true},{"citing_arxiv_id":"2507.06850","citing_title":"The Dark Side of LLMs: Agent-based Attack Vectors for System-level Compromise","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2507.08339","citing_title":"What Factors Affect LLMs and RLLMs in Financial Question Answering?","ref_index":16,"is_internal_anchor":true},{"citing_arxiv_id":"2509.10546","citing_title":"Learning to Conceal Risk: Controllable Multi-turn Red Teaming for LLMs in the Financial Domain","ref_index":36,"is_internal_anchor":true},{"citing_arxiv_id":"2509.07177","citing_title":"Towards EnergyGPT: A Large Language Model Specialized for the Energy Sector","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2509.09544","citing_title":"MetaGraph: A Large-Scale Meta-Analysis of GenAI in Financial NLP (2022-2025)","ref_index":49,"is_internal_anchor":true},{"citing_arxiv_id":"2509.14594","citing_title":"SynBench: A Benchmark for Differentially Private Text Generation","ref_index":45,"is_internal_anchor":true},{"citing_arxiv_id":"2509.13047","citing_title":"Multi-Model Synthetic Training for Mission-Critical Small Language Models","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2509.21637","citing_title":"BoHA: Blockwise Hadamard Product Adaptation for Parameter-Efficient Fine-Tuning","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2511.13131","citing_title":"MM-Telco: Benchmarks and Multimodal Large Language Models for Telecom Applications","ref_index":41,"is_internal_anchor":true},{"citing_arxiv_id":"2512.13040","citing_title":"Understanding Structured Financial Data with LLMs: A Case Study on Fraud Detection","ref_index":43,"is_internal_anchor":true},{"citing_arxiv_id":"2603.12564","citing_title":"Sell Me This Stock: Unsafe Recommendation Drift in LLM Agents","ref_index":6,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2","json":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2.json","graph_json":"https://pith.science/api/pith-number/PUSHK26RYUT65IRF2HJ47GK7W2/graph.json","events_json":"https://pith.science/api/pith-number/PUSHK26RYUT65IRF2HJ47GK7W2/events.json","paper":"https://pith.science/paper/PUSHK26R"},"agent_actions":{"view_html":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2","download_json":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2.json","view_paper":"https://pith.science/paper/PUSHK26R","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2303.17564&json=true","fetch_graph":"https://pith.science/api/pith-number/PUSHK26RYUT65IRF2HJ47GK7W2/graph.json","fetch_events":"https://pith.science/api/pith-number/PUSHK26RYUT65IRF2HJ47GK7W2/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2/action/timestamp_anchor","attest_storage":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2/action/storage_attestation","attest_author":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2/action/author_attestation","sign_citation":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2/action/citation_signature","submit_replication":"https://pith.science/pith/PUSHK26RYUT65IRF2HJ47GK7W2/action/replication_record"}},"created_at":"2026-05-18T03:47:53.393859+00:00","updated_at":"2026-05-18T03:47:53.393859+00:00"}