Prover-verifier games improve legibility of LLM outputs

9 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Self-Trained Verification for Training- and Test-Time Self-Improvement

cs.LG · 2026-05-28 · unverdicted · novelty 6.0

Self-trained verification trains verifiers to imitate informed versions of themselves using reference solutions, improving test-time V-R loops and training-time self-improvement with reported gains of 2x on hard math and 14x on scientific reasoning.

CLORE: Content-Level Optimization for Reasoning Efficiency

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

CLORE augments correct on-policy rollouts by deleting repetitive and irrelevant segments, then optimizes with an auxiliary DPO objective to improve the accuracy-efficiency trade-off on math benchmarks.

Calibrating Conservatism for Scalable Oversight

cs.AI · 2026-05-27 · unverdicted · novelty 5.0

CCO aggregates scoring functions into a calibrated penalty, using conformal decision theory to enforce target violation rates for AI oversight on benchmarks such as a modified SWE-bench and MACHIAVELLI.
