Galactica: A Large Language Model for Science

Canonical reference. 85% of citing Pith papers cite this work as background.

58 Pith papers citing it
Background: 85% of classified citations

abstract

Information overload is a major obstacle to scientific progress. The explosive growth in scientific literature and data has made it ever harder to discover useful insights in a large mass of information. Today scientific knowledge is accessed through search engines, but they are unable to organize scientific knowledge alone. In this paper we introduce Galactica: a large language model that can store, combine and reason about scientific knowledge. We train on a large scientific corpus of papers, reference material, knowledge bases and many other sources. We outperform existing models on a range of scientific tasks. On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%. Galactica also performs well on reasoning, outperforming Chinchilla on mathematical MMLU by 41.3% to 35.7%, and PaLM 540B on MATH with a score of 20.4% versus 8.8%. It also sets a new state-of-the-art on downstream tasks such as PubMedQA and MedMCQA dev of 77.6% and 52.9%. And despite not being trained on a general corpus, Galactica outperforms BLOOM and OPT-175B on BIG-bench. We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community.
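
The head-to-head scores quoted in the abstract are easier to compare as percentage-point margins. A minimal Python sketch, using only the figures stated above; the dictionary layout and variable names are illustrative assumptions, not part of the paper:

```python
# Scores (in %) as quoted in the abstract: (Galactica, compared model).
# The structure and names here are illustrative, not from the paper.
reported = {
    "LaTeX equations vs GPT-3": (68.2, 49.0),
    "Mathematical MMLU vs Chinchilla": (41.3, 35.7),
    "MATH vs PaLM 540B": (20.4, 8.8),
}

# Absolute margin in percentage points, rounded to one decimal
# to hide floating-point noise.
margins = {
    task: round(galactica - other, 1)
    for task, (galactica, other) in reported.items()
}

for task, margin in margins.items():
    print(f"{task}: +{margin} points")
```

These are absolute differences between the reported scores, not relative improvements.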

citation-role summary

background 11 · baseline 1 · method 1
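
The role counts above appear to be consistent with the 85% background share reported for this hub. A small Python sketch of that arithmetic, assuming the share is computed over the classified citations tallied here (the page does not state the formula, so this is an assumption, and the variable names are illustrative):

```python
# Citation-role counts as listed in the summary above.
role_counts = {"background": 11, "baseline": 1, "method": 1}

# Assumption: the percentage is taken over classified citations only.
total_classified = sum(role_counts.values())

# Share of each role, rounded to the nearest whole percent.
role_shares = {
    role: round(100 * count / total_classified)
    for role, count in role_counts.items()
}

print(total_classified)
print(role_shares)
```

Under that assumption, 11 of 13 classified citations is about 84.6%, which rounds to the 85% shown.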

representative citing papers

ACL-Verbatim: hallucination-free question answering for research

cs.CL · 2026-05-20 · unverdicted · novelty 7.0

The work creates a new ground truth dataset for mapping queries to verbatim text spans in research papers and shows a 150M-parameter ModernBERT token classifier achieving 53.6 word-level F1, outperforming LLM extractors at 48.7.

SPARK: Self-Play with Asymmetric Reward from Knowledge Graphs

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

SPARK constructs unified knowledge graphs from multi-document scientific literature to ground self-play RL with asymmetric roles and verifiable rewards, outperforming flat-corpus baselines especially on longer-hop reasoning tasks.

Capacity-Aware Mixture Law Enables Efficient LLM Data Optimization

cs.LG · 2026-03-09 · unverdicted · novelty 6.0

CAMEL is a scaling law capturing nonlinear model-size and mixture interactions to extrapolate optimal data mixtures for large LLMs from small-model experiments, reducing optimization cost by 50% and improving benchmarks by up to 3%.

citing papers explorer

Showing 9 of 9 citing papers after filters.

  • MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
    cs.CL · 2023-09-21 · conditional · none · ref 68 · internal anchor

    Bootstrapping math questions via rewriting creates MetaMathQA; fine-tuning LLaMA-2 on it yields 66.4% on GSM8K for 7B and 82.3% for 70B, beating prior same-size models by large margins.

  • MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
    cs.CL · 2023-09-11 · conditional · none · ref 43 · internal anchor

    MAmmoTH models trained via hybrid CoT-PoT instruction tuning on MathInstruct outperform prior open-source LLMs by 16-32% average accuracy on nine math datasets, reaching 33% and 44% on MATH for 7B and 34B scales.

  • Nougat: Neural Optical Understanding for Academic Documents
    cs.LG · 2023-08-25 · conditional · none · ref 35 · internal anchor

    Nougat applies a visual transformer to convert academic PDFs into markup language while accurately handling mathematical content on a new scientific document dataset.

  • Scaling Data-Constrained Language Models
    cs.CL · 2023-05-25 · conditional · none · ref 116 · internal anchor

    Repeating training data up to 4 epochs yields negligible loss increase versus unique data for fixed compute, and a new scaling law accounts for the decaying value of repeated tokens and excess parameters.

  • BloombergGPT: A Large Language Model for Finance
    cs.LG · 2023-03-30 · conditional · none · ref 117 · internal anchor

    BloombergGPT is a 50B parameter LLM trained on a 708B token mixed financial and general dataset that outperforms prior models on financial benchmarks while preserving general LLM performance.

  • Baichuan 2: Open Large-scale Language Models
    cs.CL · 2023-09-19 · unverdicted · none · ref 67 · internal anchor

    Baichuan 2 presents 7B and 13B LLMs trained on 2.6T tokens that match or exceed similar open models on MMLU, CMMLU, GSM8K, HumanEval and excel in medicine and law.

  • A Survey of Large Language Models
    cs.CL · 2023-03-31 · accept · none · ref 37 · internal anchor

    This survey reviews the background, key techniques, and evaluation methods for large language models, emphasizing emergent abilities that appear at large scales.

  • Natural Language Processing in the Legal Domain
    cs.CL · 2023-02-23 · unverdicted · none · ref 65 · internal anchor

    A survey of nearly 1000 NLP & Law papers from 2013-2024 documenting increases in publication volume, scope, methodological sophistication, and data/code availability.

  • A Comprehensive Overview of Large Language Models
    cs.CL · 2023-07-12 · unverdicted · none · ref 148 · internal anchor

    A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.