John Schulman

Identifiers

  • name variant John Schulman 0.60 · backfill

Papers (37)

  1. Reasoning Models Don't Always Say What They Think · cs.CL · 2025 · author #6
  2. Measuring short-form factuality in large language models · cs.CL · 2024 · author #7
  3. GPT-4o System Card · cs.CL · 2024 · author #193
  4. Let's Verify Step by Step · cs.LG · 2023 · author #8
  5. GPT-4 Technical Report · cs.CL · 2023 · author #215
  6. Scaling Laws for Reward Model Overoptimization · cs.LG · 2022 · author #2
  7. Efficient Training of Language Models to Fill in the Middle · cs.CL · 2022 · author #4
  8. Training language models to follow instructions with human feedback · cs.CL · 2022 · author #11
  9. WebGPT: Browser-assisted question-answering with human feedback · cs.CL · 2021 · author #18
  10. Training Verifiers to Solve Math Word Problems · cs.LG · 2021 · author #12
  11. Unsolved Problems in ML Safety · cs.LG · 2021 · author #3
  12. Scaling Laws for Autoregressive Generative Modeling · cs.LG · 2020 · author #17
  13. Policy Gradient Search: Online Planning and Expert Iteration without Search Trees · cs.LG · 2019 · author #5
  14. Semi-Supervised Learning by Label Gradient Alignment · cs.LG · 2019 · author #2
  15. Quantifying Generalization in Reinforcement Learning · cs.LG · 2018 · author #5
  16. Model-Based Reinforcement Learning via Meta-Policy Optimization · cs.LG · 2018 · author #3
  17. Gotta Learn Fast: A New Benchmark for Generalization in RL · cs.LG · 2018 · author #5
  18. On First-Order Meta-Learning Algorithms · cs.LG · 2018 · author #3
  19. Meta Learning Shared Hierarchies · cs.LG · 2017 · author #5
  20. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations · cs.LG · 2017 · author #5
  21. Proximal Policy Optimization Algorithms · cs.LG · 2017 · author #1
  22. Teacher-Student Curriculum Learning · cs.LG · 2017 · author #4
  23. UCB Exploration via Q-Ensembles · cs.LG · 2017 · author #4
  24. Equivalence Between Policy Gradients and Soft Q-Learning · cs.LG · 2017 · author #1
  25. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning · cs.AI · 2016 · author #7
  26. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning · cs.AI · 2016 · author #2
  27. Variational Lossy Autoencoder · cs.LG · 2016 · author #6
  28. Concrete Problems in AI Safety · cs.AI · 2016 · author #5
  29. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets · cs.LG · 2016 · author #4
  30. OpenAI Gym · cs.LG · 2016 · author #5
  31. VIME: Variational Information Maximizing Exploration · cs.LG · 2016 · author #4
  32. Kahler-Einstein and Kahler scalar flat supermanifolds · hep-th · 2016 · author #3
  33. Theano: A Python framework for fast computation of mathematical expressions · cs.SC · 2016 · author #89
  34. Benchmarking Deep Reinforcement Learning for Continuous Control · cs.LG · 2016 · author #4
  35. Gradient Estimation Using Stochastic Computation Graphs · cs.LG · 2015 · author #1
  36. High-Dimensional Continuous Control Using Generalized Advantage Estimation · cs.LG · 2015 · author #1
  37. Trust Region Policy Optimization · cs.LG · 2015 · author #1

Mentions

  • 1502.05477 · #1 · backfill · confidence 0.70 · John Schulman
  • 2210.10760 · #2 · arxiv_oai · confidence 0.70 · John Schulman
  • 2207.14255 · #4 · arxiv_oai · confidence 0.70 · John Schulman
  • 2109.13916 · #3 · arxiv_oai · confidence 0.70 · John Schulman
  • 2303.08774 · #215 · backfill · confidence 0.70 · John Schulman

Frequent Coauthors