RedPajama: an open dataset for training large language models

Maurice Weber, Daniel Fu, Quentin Anthony, Yonatan Oren, Shane Adams, Anton Alexandrov, Xiaozhong Lyu, Huu Nguyen, Xiaozhe Yao, Virginia Adams, et al. · 2024 · arXiv 2411.12372

2 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models

cs.CL · 2025-08-09 · conditional · novelty 6.0

A progressive training scheme with binary-aware initialization and dual-scaling allows pre-trained LLMs to be converted to high-performance 1-bit models without training from scratch.

Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining

cs.CL · 2025-11-26 · unverdicted · novelty 5.0

Fine-grained metadata such as document quality indicators accelerate LLM pretraining when prepended, and metadata appending plus learnable meta-tokens recover additional speedup via auxiliary tasks and latent structure.

citing papers explorer


Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models

cs.CL · 2025-08-09 · conditional · none · ref 39

A progressive training scheme with binary-aware initialization and dual-scaling allows pre-trained LLMs to be converted to high-performance 1-bit models without training from scratch.

Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining

cs.CL · 2025-11-26 · unverdicted · none · ref 23

Fine-grained metadata such as document quality indicators accelerate LLM pretraining when prepended, and metadata appending plus learnable meta-tokens recover additional speedup via auxiliary tasks and latent structure.
