Yongkang Zhang

Identifiers

  • name variant Yongkang Zhang 0.60 · backfill

Papers (4)

  1. Reducing Credit Assignment Variance via Counterfactual Reasoning Paths · cs.LG · 2026 · author #2
  2. Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning · cs.LG · 2026 · author #2
  3. Rethinking the Comparison Unit in Sequence-Level Reinforcement Learning: An Equal-Length Paired Training Framework from Loss Correction to Sample Construction · cs.LG · 2026 · author #2
  4. Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation · cs.LG · 2026 · author #2

Mentions

  • 2605.05226 #2 · arxiv_oai · confidence 0.70 Yongkang Zhang
  • 2604.17328 #2 · arxiv_oai · confidence 0.70 Yongkang Zhang
  • 2604.13088 #2 · arxiv_oai · confidence 0.70 Yongkang Zhang
  • 2605.16302 #2 · arxiv_oai · confidence 0.70 Yongkang Zhang

Frequent Coauthors