Jacob Steinhardt
Identifiers
- name variant Jacob Steinhardt 0.60 · backfill
Papers (33)
- Log analysis is necessary for credible evaluation of AI agents cs.AI · 2026 · author #10
- ADAG: Automatically Describing Attribution Graphs cs.CL · 2026 · author #3
- Jailbroken: How Does LLM Safety Training Fail? cs.LG · 2023 · author #3
- Eliciting Latent Predictions from Transformers with the Tuned Lens cs.LG · 2023 · author #8
- Progress measures for grokking via mechanistic interpretability cs.LG · 2023 · author #5
- Discovering Latent Knowledge in Language Models Without Supervision cs.CL · 2022 · author #4
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small cs.LG · 2022 · author #5
- The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models cs.LG · 2022 · author #3
- Unsolved Problems in ML Safety cs.LG · 2021 · author #4
- Measuring Coding Challenge Competence With APPS cs.SE · 2021 · author #11
- Measuring Mathematical Problem Solving With the MATH Dataset cs.LG · 2021 · author #8
- Measuring Massive Multitask Language Understanding cs.CY · 2020 · author #7
- Aligning AI With Shared Human Values cs.CY · 2020 · author #7
- Transfer of Adversarial Robustness Between Perturbation Types cs.LG · 2019 · author #5
- FrAngel: Component-Based Synthesis with Control Structures cs.PL · 2018 · author #2
- Semidefinite relaxations for certifying robustness to adversarial examples cs.LG · 2018 · author #2
- Troubling Trends in Machine Learning Scholarship stat.ML · 2018 · author #2
- Sever: A Robust Meta-Algorithm for Stochastic Optimization cs.LG · 2018 · author #5
- Better Agnostic Clustering Via Relaxed Tensor Norms cs.LG · 2017 · author #2
- Certified Defenses for Data Poisoning Attacks cs.LG · 2017 · author #1
- Does robustness imply tractability? A lower bound for planted clique in the semi-random model cs.CC · 2017 · author #1
- Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers cs.LG · 2017 · author #1
- Learning from Untrusted Data cs.LG · 2016 · author #2
- Concrete Problems in AI Safety cs.AI · 2016 · author #3
- Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction cs.HC · 2016 · author #1
- Unsupervised Risk Estimation Using Only Conditional Independence Structure cs.LG · 2016 · author #1
- Learning Fast-Mixing Models for Structured Prediction cs.LG · 2015 · author #1
- Reified Context Models cs.LG · 2015 · author #1
- The Statistics of Streaming Sparse Regression math.ST · 2014 · author #1
- Permutations with Ascending and Descending Blocks math.CO · 2009 · author #1
- Derangements with Ascending and Descending Blocks math.CO · 2009 · author #1
- On Coloring the Odd-Distance Graph math.CO · 2009 · author #1
- Cayley graphs formed by conjugate generating sets of S_n math.CO · 2007 · author #1
Mentions
- 2008.02275 #7 · arxiv_oai · confidence 0.70 Jacob Steinhardt
- 2201.03544 #3 · arxiv_oai · confidence 0.70 Jacob Steinhardt
- 2109.13916 #4 · arxiv_oai · confidence 0.70 Jacob Steinhardt
- 0908.4347 #1 · backfill · confidence 0.70 Jacob Steinhardt
- 0908.3330 #1 · backfill · confidence 0.70 Jacob Steinhardt
- 0908.1452 #1 · backfill · confidence 0.70 Jacob Steinhardt
- 2212.03827 #4 · arxiv_oai · confidence 0.70 Jacob Steinhardt
- 0711.3057 #1 · backfill · confidence 0.70 Jacob Steinhardt
Frequent Coauthors
- Percy Liang 7 shared papers
- Dan Hendrycks 6 shared papers
- Collin Burns 5 shared papers
- Dawn Song 4 shared papers
- Steven Basart 4 shared papers
- Gregory Valiant 3 shared papers
- Moses Charikar 3 shared papers
- Akul Arora 2 shared papers
- Jerry Li 2 shared papers
- John Schulman 2 shared papers
- Mantas Mazeika 2 shared papers
- Saurav Kadavath 2 shared papers
- Aditi Raghunathan 1 shared paper
- Alexander Pan 1 shared paper
- Alexander Wei 1 shared paper
- Alexandre Variengien 1 shared paper
- Alistair Stewart 1 shared paper
- Andrew Critch 1 shared paper
- Andy Zou 1 shared paper
- Arthur Conmy 1 shared paper