Jared Kaplan

Identifiers

  • name variant Jared Kaplan 0.60 · backfill

Papers (70)

  1. Reasoning Models Don't Always Say What They Think cs.CL · 2025 · author #14
  2. Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming cs.CL · 2025 · author #42
  3. Alignment faking in large language models cs.AI · 2024 · author #17
  4. Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models cs.AI · 2024 · author #10
  5. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training cs.CR · 2024 · author #34
  6. Measuring Faithfulness in Chain-of-Thought Reasoning cs.AI · 2023 · author #27
  7. Towards Measuring the Representation of Subjective Global Opinions in Language Models cs.CL · 2023 · author #16
  8. Discovering Language Model Behaviors with Model-Written Evaluations cs.CL · 2022 · author #63
  9. Constitutional AI: Harmlessness from AI Feedback cs.CL · 2022 · author #51
  10. Measuring Progress on Scalable Oversight for Large Language Models cs.HC · 2022 · author #46
  11. In-context Learning and Induction Heads cs.LG · 2022 · author #24
  12. Toy Models of Superposition cs.LG · 2022 · author #13
  13. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned cs.CL · 2022 · author #35
  14. Language Models (Mostly) Know What They Know cs.CL · 2022 · author #36
  15. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models cs.CL · 2022 · author #180
  16. Scaling Laws and Interpretability of Learning from Repeated Data cs.LG · 2022 · author #17
  17. Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback cs.CL · 2022 · author #31
  18. A General Language Assistant as a Laboratory for Alignment cs.CL · 2021 · author #22
  19. Evaluating Large Language Models Trained on Code cs.LG · 2021 · author #6
  20. Scaling Laws for Transfer cs.LG · 2021 · author #2
  21. Scaling Laws for Autoregressive Generative Modeling cs.LG · 2020 · author #2
  22. Language Models are Few-Shot Learners cs.CL · 2020 · author #5
  23. Scaling Laws for Neural Language Models cs.LG · 2020 · author #1
  24. An Empirical Model of Large-Batch Training cs.LG · 2018 · author #2
  25. Lightcone Effective Hamiltonians and RG Flows hep-th · 2018 · author #2
  26. The AdS$_3$ Propagator and the Fate of Locality hep-th · 2017 · author #3
  27. An Exact Operator That Knows Its Location hep-th · 2017 · author #4
  28. A Numerical Approach to Virasoro Blocks and the Information Paradox hep-th · 2017 · author #3
  29. Exact Virasoro Blocks from Wilson Lines and Background-Independent Operators hep-th · 2016 · author #2
  30. On the Late-Time Behavior of Virasoro Blocks and a Classification of Semiclassical Saddles hep-th · 2016 · author #2
  31. Degenerate Operators and the $1/c$ Expansion: Lorentzian Resummations, High Order Computations, and Super-Virasoro Blocks hep-th · 2016 · author #3
  32. On Information Loss in AdS$_3$/CFT$_2$ hep-th · 2016 · author #2
  33. A Quantum Correction To Chaos hep-th · 2016 · author #2
  34. Conformal Blocks Beyond the Semi-Classical Limit hep-th · 2015 · author #2
  35. Hawking from Catalan hep-th · 2015 · author #2
  36. Eikonalization of Conformal Blocks hep-th · 2015 · author #2
  37. Virasoro Conformal Blocks and Thermality from Classical Background Fields hep-th · 2015 · author #2
  38. Enhanced Pairing of Quantum Critical Metals Near d=3+1 cond-mat.str-el · 2014 · author #3
  39. An Effective Theory for Holographic RG Flows hep-th · 2014 · author #1
  40. Universality of Long-Distance AdS Physics from the CFT Bootstrap hep-th · 2014 · author #2
  41. Slow Fermions in Quantum Critical Metals cond-mat.str-el · 2014 · author #3
  42. Covariant Approaches to Superconformal Blocks hep-th · 2014 · author #2
  43. Non-Fermi liquid behavior of large N_B quantum critical metals cond-mat.str-el · 2013 · author #3
  44. Non-Fermi liquid fixed point in a Wilsonian theory of quantum critical metals cond-mat.str-el · 2013 · author #3
  45. Conformal Blocks in the Large D Limit hep-th · 2013 · author #2
  46. Decoupling of High Dimension Operators from the Low Energy Sector in Holographic Models hep-th · 2013 · author #2
  47. The Analytic Bootstrap and AdS Superhorizon Locality hep-th · 2012 · author #2
  48. AdS Field Theory from Conformal Field Theory hep-th · 2012 · author #2
  49. A New Theory of Anyons hep-th · 2012 · author #3
  50. Unitarity and the Holographic S-Matrix hep-th · 2011 · author #2
  51. Analyticity and the Holographic S-Matrix hep-th · 2011 · author #2
  52. Heavy Flavor Simplified Models at the LHC hep-ph · 2011 · author #3
  53. A Natural Language for AdS/CFT Correlators hep-th · 2011 · author #2
  54. Simplified Models for LHC New Physics Searches hep-ph · 2011 · author #36
  55. Scattering States in AdS/CFT hep-th · 2011 · author #2
  56. LHC Predictions from a Tevatron Anomaly in the Top Quark Forward-Backward Asymmetry hep-ph · 2011 · author #3
  57. Discovering New Light States at Neutrino Experiments hep-ph · 2010 · author #3
  58. On the Origin of Light Dark Matter Species hep-ph · 2010 · author #2
  59. Unraveling L_{n,k}: Grassmannian Kinematics hep-th · 2009 · author #1
  60. A Duality For The S Matrix hep-th · 2009 · author #4
  61. The S-Matrix in Twistor Space hep-th · 2009 · author #4
  62. What is the Simplest Quantum Field Theory? hep-th · 2008 · author #3
  63. On Tree Amplitudes in Gauge Theory and Gravity hep-th · 2008 · author #2
  64. On the consistency relation of the 3-point function in single field inflation hep-th · 2007 · author #3
  65. The Plasma Puddle as a Perturbative Black Hole hep-th · 2007 · author #2
  66. Searching for the Kaluza-Klein Graviton in Bulk RS Models hep-ph · 2007 · author #2
  67. Avoiding an Empty Universe in RS I Models and Large-N Gauge Theories hep-ph · 2006 · author #1
  68. Dark Matter Generation and Split Supersymmetry hep-ph · 2006 · author #1
  69. The Fall 2004 SDSS Supernova Survey astro-ph · 2005 · author #11
  70. Extracting Data from Behind Horizons with the AdS/CFT Correspondence hep-th · 2004 · author #1

Mentions

  • 1510.00014 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1504.01737 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1501.05315 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1410.6814 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1406.4152 #1 · backfill · confidence 0.70 Jared Kaplan
  • 1403.6829 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1402.5413 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1402.1167 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1312.3321 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1307.0004 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1305.0004 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1304.3458 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1212.3616 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1208.0337 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1205.6816 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1112.4845 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1111.6972 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1110.6443 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1107.1499 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1105.2838 #36 · backfill · confidence 0.70 Jared Kaplan
  • 1104.2597 #2 · backfill · confidence 0.70 Jared Kaplan
  • 1101.5203 #3 · backfill · confidence 0.70 Jared Kaplan
  • 1008.0636 #3 · backfill · confidence 0.70 Jared Kaplan
  • 2102.01293 #2 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 2501.18837 #42 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 1004.0691 #2 · backfill · confidence 0.70 Jared Kaplan
  • 2205.10487 #17 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 2211.03540 #46 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 2406.10162 #10 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 0912.0957 #1 · backfill · confidence 0.70 Jared Kaplan
  • 0907.5418 #4 · backfill · confidence 0.70 Jared Kaplan
  • 2306.16388 #16 · arxiv_oai · confidence 0.70 Jared Kaplan
  • 0903.2110 #4 · backfill · confidence 0.70 Jared Kaplan
  • 0808.1446 #3 · backfill · confidence 0.70 Jared Kaplan
  • 0801.2385 #2 · backfill · confidence 0.70 Jared Kaplan
  • 0709.0295 #3 · backfill · confidence 0.70 Jared Kaplan
  • 0704.1146 #2 · backfill · confidence 0.70 Jared Kaplan

Frequent Coauthors