When evaluating the resulting models, we use the evaluation framework LM Evaluation Harness and the default tasks gsm8k and hendrycks_math (Gao et al
· 2024
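As a rough illustration of the setup quoted above, the sketch below assembles an `lm_eval` command line for the two named tasks. The excerpt names only the framework and the tasks, so the `hf` model backend, the batch size, the placeholder model path, and the helper name `build_lm_eval_command` are all assumptions made for this example.

```python
import shlex


def build_lm_eval_command(model_path, tasks=("gsm8k", "hendrycks_math"), batch_size=8):
    """Return an argv list for an LM Evaluation Harness run.

    Assumptions (not stated in the quoted text): the Hugging Face
    backend ("hf"), the batch size, and the model path are placeholders.
    """
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_path}",
        "--tasks", ",".join(tasks),  # the harness takes a comma-separated task list
        "--batch_size", str(batch_size),
    ]


# Hypothetical model path, used only to show the shape of the command.
cmd = build_lm_eval_command("path/to/model")
print(shlex.join(cmd))
```

The command is only built and printed here, not executed. It could be passed to `subprocess.run(cmd, check=True)` on a machine where the harness is installed.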