Rae, Oriol Vinyals, and Laurent Sifre
5 Pith papers cite this work.
citing papers explorer
- MosaicMRI: A Diverse Dataset and Benchmark for Raw Musculoskeletal MRI
  MosaicMRI provides a diverse raw MSK MRI dataset that enables deep learning models to exploit cross-anatomical correlations, outperforming anatomy-specific training in low-sample regimes for accelerated reconstruction.
- The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale
  FineWeb is a curated 15T-token web dataset that produces stronger LLMs than prior open collections, while its educational subset sharply improves performance on MMLU and ARC benchmarks.
- Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
  Pre-training loss predicts LLM math reasoning better than parameter count; rejection sampling fine-tuning with diverse paths raises LLaMA-7B accuracy on GSM8K from 35.9% with SFT to 49.3%.
- The Principle of Maximum Heterogeneity Optimises Productivity in Distributed Production Systems Across Biology, Economics, and Computing
  Distributed systems in biology, economics, and computing optimize productivity by converging on maximum feasible heterogeneity, with environmental demands and communication topology setting the limits.
- Lessons from the Trenches on Reproducible Evaluation of Language Models