pith. sign in

Physics of language models: Part 2.2, how to learn from mistakes on grade-school math problems

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

roles

background 1

polarities

background 1

clear filters

representative citing papers

Search-o1: Agentic Search-Enhanced Large Reasoning Models

cs.AI · 2025-01-09 · unverdicted · novelty 6.0

Search-o1 integrates agentic retrieval-augmented generation and a Reason-in-Documents module into large reasoning models to dynamically supply missing knowledge and improve performance on complex science, math, coding, and QA tasks.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • Search-o1: Agentic Search-Enhanced Large Reasoning Models cs.AI · 2025-01-09 · unverdicted · none · ref 71

    Search-o1 integrates agentic retrieval-augmented generation and a Reason-in-Documents module into large reasoning models to dynamically supply missing knowledge and improve performance on complex science, math, coding, and QA tasks.

  • Training Language Models to Self-Correct via Reinforcement Learning cs.LG · 2024-09-19 · unverdicted · none · ref 40

    SCoRe uses multi-turn online RL with regularization on self-generated traces to improve LLM self-correction, achieving 15.6% and 9.1% gains on MATH and HumanEval for Gemini models.