Rohan Surana

Identifiers

No identifiers captured yet.

Papers (5)

  1. F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking · cs.LG · 2026 · author #1
  2. MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization · cs.LG · 2026 · author #1
  3. Skill-R1: Agent Skill Evolution via Reinforcement Learning · cs.LG · 2026 · author #2
  4. Skill-CMIB: Multimodal Agent Skill for Consistent Action via Conditional Multimodal Information Bottleneck · cs.LG · 2026 · author #5
  5. Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning · cs.LG · 2026 · author #1

Mentions

No mention provenance yet.

Frequent Coauthors