MathAtlas is the first large-scale benchmark for autoformalizing graduate mathematics, where even strong models reach only 9.8% correctness on theorem statements and drop to 2.6% on the hardest dependency-deep subset.
2 Pith papers cite this work.
years: 2026 (2)
representative citing papers
DeltaPrompts generates 200k synthetic high-divergence reasoning prompts to escape zero-delta saturation in multimodal distillation, yielding up to 15% relative gains on chart, document, and perception benchmarks across multiple settings.
citing papers explorer
- MathAtlas: A Benchmark for Autoformalization in the Wild
- DeltaPrompts: Escaping the Zero-Delta Trap in Multimodal Distillation