Training verifiers to solve math word problems

Tool reference. 100% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.

19 Pith papers citing it
Method reference 100% of classified citations

citation-role summary

dataset 7

citation-polarity summary

use dataset 7

representative citing papers

GAIA: a benchmark for General AI Assistants

cs.CL · 2023-11-21 · unverdicted · novelty 7.0

The GAIA benchmark shows that humans reach 92% accuracy on simple real-world questions, far above the 15% achieved by current AI systems, and proposes closing this gap as a key milestone for general AI.

Agentic Systems as Boosting Weak Reasoning Models

cs.AI · 2026-05-13 · unverdicted · novelty 6.0

Verifier-backed committee search boosts a weak reasoning model from 67% to 76.4% on SWE-bench Verified, matching stronger models by using local soundness signals to select among proposals.

RAGEN-2: Reasoning Collapse in Agentic RL

cs.LG · 2026-04-07 · unverdicted · novelty 6.0

Template collapse is a distinct failure mode in agentic RL that entropy does not reveal. Mutual-information proxies diagnose it better, and SNR-aware filtering based on reward variance improves input-dependent reasoning and task performance across planning, math, navigation, and code tasks.

Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

cs.AI · 2025-01-15 · unverdicted · novelty 4.0

Agentic RAG embeds agents with reflection, planning, tool use, and collaboration into retrieval pipelines to overcome static RAG limitations, and the survey offers a taxonomy by agent count, control, autonomy, and knowledge representation plus applications and open challenges.
