Matharena: Evaluating llms on uncontaminated math competitions.Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmark, 2025

Mislav Balunovi´c, Jasper Dekoninck, Ivo Petrov, Nikola Jovanovi ´c, Martin Vechev · 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Test-Time Learning with an Evolving Library

cs.LG · 2026-05-14 · unverdicted · novelty 7.0

EvoLib enables LLMs to accumulate, reuse, and evolve knowledge abstractions from inference trajectories at test time, yielding substantial gains on math reasoning, code generation, and agentic benchmarks without parameter updates or supervision.

citing papers explorer

Showing 1 of 1 citing paper.

Test-Time Learning with an Evolving Library cs.LG · 2026-05-14 · unverdicted · none · ref 44
EvoLib enables LLMs to accumulate, reuse, and evolve knowledge abstractions from inference trajectories at test time, yielding substantial gains on math reasoning, code generation, and agentic benchmarks without parameter updates or supervision.

Matharena: Evaluating llms on uncontaminated math competitions.Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmark, 2025

fields

years

verdicts

representative citing papers

citing papers explorer