Zhouhao Sun

Identifiers

No identifiers captured yet.

Papers (4)

  1. GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models · cs.AI · 2026 · author #1
  2. Consolidation or Adaptation? PRISM: Disentangling SFT and RL Data via Gradient Concentration · cs.AI · 2026 · author #8
  3. MAESTRO: Meta-learning Adaptive Estimation of Scalarization Trade-offs for Reward Optimization · cs.LG · 2026 · author #8
  4. Large Language Models Are Still Misled by Simple Bias Ensembles · cs.CL · 2025 · author #1

Mentions

No mention provenance yet.

Frequent Coauthors