solution_graph

complete_solution (the rigorous step-by-step solution text) Respond EXACTLY in the following format, including the JSON start, end markers: { "solution_graph": { "nodes": [{"id"

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Fine-Grained Benchmark Generation for Comprehensive Evaluation of Foundation Models

cs.LG · 2026-05-12 · unverdicted · novelty 5.0 · ref 36

A multi-agent pipeline generates benchmarks with rich metadata and reliable ground truths from textbooks, yielding three new sets in ML and finance domains that show lower expert-verified error rates and more uniform coverage than MMLU or GSM8K.
