Murdock Aubry
Identifiers
No identifiers captured yet.
Papers (1)
- Trust the Batch, On- or Off-Policy: Adaptive Policy Optimization for RL Post-Training (cs.LG · 2026 · author #2)
Mentions
No mention provenance yet.
Frequent Coauthors
- Alexander J. Smola (1 shared paper)
- Nicholas Stranges (1 shared paper)
- Rasool Fakoor (1 shared paper)