pith. sign in

Towards robust mathematical reasoning

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

dataset 2

citation-polarity summary

fields

cs.AI 2 cs.LO 1

years

2026 3

verdicts

UNVERDICTED 3

roles

dataset 2

polarities

use dataset 2

clear filters

representative citing papers

Pseudo-Formalization for Automatic Proof Verification

cs.LO · 2026-05-19 · unverdicted · novelty 7.0

Pseudo-Formalization decomposes proofs into self-contained natural language modules for independent LLM-based Block Verification, outperforming LLM-as-judge baselines on olympiad and research math benchmarks while releasing ArxivMathGradingBench.

citing papers explorer

Showing 3 of 3 citing papers after filters.

  • Pseudo-Formalization for Automatic Proof Verification cs.LO · 2026-05-19 · unverdicted · none · ref 20

    Pseudo-Formalization decomposes proofs into self-contained natural language modules for independent LLM-based Block Verification, outperforming LLM-as-judge baselines on olympiad and research math benchmarks while releasing ArxivMathGradingBench.

  • Seir\^enes: Adversarial Self-Play with Evolving Distractions for LLM Reasoning cs.AI · 2026-05-12 · unverdicted · none · ref 43

    Seirênes trains LLMs via adversarial self-play to generate and overcome evolving distractions, producing gains of 7-10 points on math reasoning benchmarks and exposing blind spots in larger models.

  • Seed1.8 Model Card: Towards Generalized Real-World Agency cs.AI · 2026-03-21 · unverdicted · none · ref 42

    Seed1.8 is a new foundation model that adds unified agentic capabilities for search, code execution, and GUI interaction to existing LLM and vision strengths.