Jared Kaplan

Identifiers

name variant · Jared Kaplan · 0.60 · backfill

Papers (70)

Reasoning Models Don't Always Say What They Think · cs.CL · 2025 · author #14
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming · cs.CL · 2025 · author #42
Alignment faking in large language models · cs.AI · 2024 · author #17
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models · cs.AI · 2024 · author #10
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training · cs.CR · 2024 · author #34
Measuring Faithfulness in Chain-of-Thought Reasoning · cs.AI · 2023 · author #27
Towards Measuring the Representation of Subjective Global Opinions in Language Models · cs.CL · 2023 · author #16
Discovering Language Model Behaviors with Model-Written Evaluations · cs.CL · 2022 · author #63
Constitutional AI: Harmlessness from AI Feedback · cs.CL · 2022 · author #51
Measuring Progress on Scalable Oversight for Large Language Models · cs.HC · 2022 · author #46
In-context Learning and Induction Heads · cs.LG · 2022 · author #24
Toy Models of Superposition · cs.LG · 2022 · author #13
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned · cs.CL · 2022 · author #35
Language Models (Mostly) Know What They Know · cs.CL · 2022 · author #36
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models · cs.CL · 2022 · author #180
Scaling Laws and Interpretability of Learning from Repeated Data · cs.LG · 2022 · author #17
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback · cs.CL · 2022 · author #31
A General Language Assistant as a Laboratory for Alignment · cs.CL · 2021 · author #22
Evaluating Large Language Models Trained on Code · cs.LG · 2021 · author #6
Scaling Laws for Transfer · cs.LG · 2021 · author #2
Scaling Laws for Autoregressive Generative Modeling · cs.LG · 2020 · author #2
Language Models are Few-Shot Learners · cs.CL · 2020 · author #5
Scaling Laws for Neural Language Models · cs.LG · 2020 · author #1
An Empirical Model of Large-Batch Training · cs.LG · 2018 · author #2
Lightcone Effective Hamiltonians and RG Flows · hep-th · 2018 · author #2
The AdS$_3$ Propagator and the Fate of Locality · hep-th · 2017 · author #3
An Exact Operator That Knows Its Location · hep-th · 2017 · author #4
A Numerical Approach to Virasoro Blocks and the Information Paradox · hep-th · 2017 · author #3
Exact Virasoro Blocks from Wilson Lines and Background-Independent Operators · hep-th · 2016 · author #2
On the Late-Time Behavior of Virasoro Blocks and a Classification of Semiclassical Saddles · hep-th · 2016 · author #2
Degenerate Operators and the $1/c$ Expansion: Lorentzian Resummations, High Order Computations, and Super-Virasoro Blocks · hep-th · 2016 · author #3
On Information Loss in AdS$_3$/CFT$_2$ · hep-th · 2016 · author #2
A Quantum Correction To Chaos · hep-th · 2016 · author #2
Conformal Blocks Beyond the Semi-Classical Limit · hep-th · 2015 · author #2
Hawking from Catalan · hep-th · 2015 · author #2
Eikonalization of Conformal Blocks · hep-th · 2015 · author #2
Virasoro Conformal Blocks and Thermality from Classical Background Fields · hep-th · 2015 · author #2
Enhanced Pairing of Quantum Critical Metals Near d=3+1 · cond-mat.str-el · 2014 · author #3
An Effective Theory for Holographic RG Flows · hep-th · 2014 · author #1
Universality of Long-Distance AdS Physics from the CFT Bootstrap · hep-th · 2014 · author #2
Slow Fermions in Quantum Critical Metals · cond-mat.str-el · 2014 · author #3
Covariant Approaches to Superconformal Blocks · hep-th · 2014 · author #2
Non-Fermi liquid behavior of large N_B quantum critical metals · cond-mat.str-el · 2013 · author #3
Non-Fermi liquid fixed point in a Wilsonian theory of quantum critical metals · cond-mat.str-el · 2013 · author #3
Conformal Blocks in the Large D Limit · hep-th · 2013 · author #2
Decoupling of High Dimension Operators from the Low Energy Sector in Holographic Models · hep-th · 2013 · author #2
The Analytic Bootstrap and AdS Superhorizon Locality · hep-th · 2012 · author #2
AdS Field Theory from Conformal Field Theory · hep-th · 2012 · author #2
A New Theory of Anyons · hep-th · 2012 · author #3
Unitarity and the Holographic S-Matrix · hep-th · 2011 · author #2
Analyticity and the Holographic S-Matrix · hep-th · 2011 · author #2
Heavy Flavor Simplified Models at the LHC · hep-ph · 2011 · author #3
A Natural Language for AdS/CFT Correlators · hep-th · 2011 · author #2
Simplified Models for LHC New Physics Searches · hep-ph · 2011 · author #36
Scattering States in AdS/CFT · hep-th · 2011 · author #2
LHC Predictions from a Tevatron Anomaly in the Top Quark Forward-Backward Asymmetry · hep-ph · 2011 · author #3
Discovering New Light States at Neutrino Experiments · hep-ph · 2010 · author #3
On the Origin of Light Dark Matter Species · hep-ph · 2010 · author #2
Unraveling L_{n,k}: Grassmannian Kinematics · hep-th · 2009 · author #1
A Duality For The S Matrix · hep-th · 2009 · author #4
The S-Matrix in Twistor Space · hep-th · 2009 · author #4
What is the Simplest Quantum Field Theory? · hep-th · 2008 · author #3
On Tree Amplitudes in Gauge Theory and Gravity · hep-th · 2008 · author #2
On the consistency relation of the 3-point function in single field inflation · hep-th · 2007 · author #3
The Plasma Puddle as a Perturbative Black Hole · hep-th · 2007 · author #2
Searching for the Kaluza-Klein Graviton in Bulk RS Models · hep-ph · 2007 · author #2
Avoiding an Empty Universe in RS I Models and Large-N Gauge Theories · hep-ph · 2006 · author #1
Dark Matter Generation and Split Supersymmetry · hep-ph · 2006 · author #1
The Fall 2004 SDSS Supernova Survey · astro-ph · 2005 · author #11
Extracting Data from Behind Horizons with the AdS/CFT Correspondence · hep-th · 2004 · author #1

Mentions

1510.00014 · #2 · backfill · confidence 0.70 · Jared Kaplan
1504.01737 · #2 · backfill · confidence 0.70 · Jared Kaplan
1501.05315 · #2 · backfill · confidence 0.70 · Jared Kaplan
1410.6814 · #3 · backfill · confidence 0.70 · Jared Kaplan
1406.4152 · #1 · backfill · confidence 0.70 · Jared Kaplan
1403.6829 · #2 · backfill · confidence 0.70 · Jared Kaplan
1402.5413 · #3 · backfill · confidence 0.70 · Jared Kaplan
1402.1167 · #2 · backfill · confidence 0.70 · Jared Kaplan
1312.3321 · #3 · backfill · confidence 0.70 · Jared Kaplan
1307.0004 · #3 · backfill · confidence 0.70 · Jared Kaplan
1305.0004 · #2 · backfill · confidence 0.70 · Jared Kaplan
1304.3458 · #2 · backfill · confidence 0.70 · Jared Kaplan
1212.3616 · #2 · backfill · confidence 0.70 · Jared Kaplan
1208.0337 · #2 · backfill · confidence 0.70 · Jared Kaplan
1205.6816 · #3 · backfill · confidence 0.70 · Jared Kaplan
1112.4845 · #2 · backfill · confidence 0.70 · Jared Kaplan
1111.6972 · #2 · backfill · confidence 0.70 · Jared Kaplan
1110.6443 · #3 · backfill · confidence 0.70 · Jared Kaplan
1107.1499 · #2 · backfill · confidence 0.70 · Jared Kaplan
1105.2838 · #36 · backfill · confidence 0.70 · Jared Kaplan
1104.2597 · #2 · backfill · confidence 0.70 · Jared Kaplan
1101.5203 · #3 · backfill · confidence 0.70 · Jared Kaplan
1008.0636 · #3 · backfill · confidence 0.70 · Jared Kaplan
2102.01293 · #2 · arxiv_oai · confidence 0.70 · Jared Kaplan
2501.18837 · #42 · arxiv_oai · confidence 0.70 · Jared Kaplan
1004.0691 · #2 · backfill · confidence 0.70 · Jared Kaplan
2205.10487 · #17 · arxiv_oai · confidence 0.70 · Jared Kaplan
2211.03540 · #46 · arxiv_oai · confidence 0.70 · Jared Kaplan
2406.10162 · #10 · arxiv_oai · confidence 0.70 · Jared Kaplan
0912.0957 · #1 · backfill · confidence 0.70 · Jared Kaplan
0907.5418 · #4 · backfill · confidence 0.70 · Jared Kaplan
2306.16388 · #16 · arxiv_oai · confidence 0.70 · Jared Kaplan
0903.2110 · #4 · backfill · confidence 0.70 · Jared Kaplan
0808.1446 · #3 · backfill · confidence 0.70 · Jared Kaplan
0801.2385 · #2 · backfill · confidence 0.70 · Jared Kaplan
0709.0295 · #3 · backfill · confidence 0.70 · Jared Kaplan
0704.1146 · #2 · backfill · confidence 0.70 · Jared Kaplan

Frequent Coauthors

A. Liam Fitzpatrick · 29 shared papers
Sam McCandlish · 18 shared papers
Dario Amodei · 15 shared papers
Tom Henighan · 14 shared papers
Amanda Askell · 13 shared papers
Danny Hernandez · 12 shared papers
Nicholas Joseph · 12 shared papers
Zac Hatfield-Dodds · 12 shared papers
Ethan Perez · 11 shared papers
Catherine Olsson · 10 shared papers
Dawn Drain · 10 shared papers
Deep Ganguli · 10 shared papers
Jackson Kernion · 10 shared papers
Nelson Elhage · 10 shared papers
Nicholas Schiefer · 10 shared papers
Nova DasSarma · 10 shared papers
Samuel R. Bowman · 10 shared papers
Yuntao Bai · 10 shared papers
Anna Chen · 9 shared papers
Ben Mann · 9 shared papers