Guowei Rong

Identifiers

  • name variant Guowei Rong 0.60 · backfill

Papers (1)

  1. Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling cs.LG · 2026 · author #2

Mentions

  • 2602.10623 #2 · arxiv_oai · confidence 0.70 Guowei Rong

Frequent Coauthors