Aviral Kumar

Identifiers

  • name variant Aviral Kumar 0.60 · backfill

Papers (21)

  1. AsyncWebRL: Efficient Multi-Step RL for Visual Web Agents · cs.LG · 2026 · author #5
  2. Recursive Agent Optimization · cs.LG · 2026 · author #4
  3. QED-Nano: Teaching a Tiny Model to Prove Hard Theorems · cs.AI · 2026 · author #9
  4. What Does Flow Matching Bring To TD Learning? · cs.LG · 2026 · author #3
  5. TRIM: Hybrid Inference via Targeted Stepwise Routing in Multi-Step Reasoning Tasks · cs.AI · 2026 · author #6
  6. WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks · cs.LG · 2026 · author #4
  7. Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities · cs.CL · 2025 · author #1615
  8. Grounded Reinforcement Learning for Visual Reasoning · cs.CV · 2025 · author #6
  9. Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning · cs.LG · 2024 · author #9
  10. Training Language Models to Self-Correct via Reinforcement Learning · cs.LG · 2024 · author #1
  11. Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters · cs.LG · 2024 · author #4
  12. Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context · cs.CL · 2024 · author #282
  13. Gemini: A Family of Highly Capable Multimodal Models · cs.CL · 2023 · author #842
  14. Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models · cs.RO · 2023 · author #6
  15. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems · cs.LG · 2020 · author #2
  16. D4RL: Datasets for Deep Data-Driven Reinforcement Learning · cs.LG · 2020 · author #2
  17. Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning · cs.LG · 2019 · author #2
  18. Graph Normalizing Flows · cs.LG · 2019 · author #2
  19. Calibration of Encoder Decoder Models for Neural Machine Translation · cs.LG · 2019 · author #1
  20. Diagnosing Bottlenecks in Deep Q-learning Algorithms · cs.LG · 2019 · author #2
  21. The Reach-Avoid Problem for Constant-Rate Multi-Mode Systems · cs.LO · 2017 · author #2

Mentions

  • 2606.05597 #5 · arxiv_oai · confidence 0.70 · Aviral Kumar
  • 2410.08146 #9 · arxiv_oai · confidence 0.70 · Aviral Kumar
  • 2505.23678 #6 · arxiv_oai · confidence 0.70 · Aviral Kumar
  • 2409.12917 #1 · arxiv_oai · confidence 0.70 · Aviral Kumar
  • 2310.10639 #6 · arxiv_oai · confidence 0.70 · Aviral Kumar

Frequent Coauthors