Jacob Steinhardt — Pith Author Registry

Identifiers

name variant Jacob Steinhardt 0.60 · backfill

Papers (33)

Log analysis is necessary for credible evaluation of AI agents cs.AI · 2026 · author #10
ADAG: Automatically Describing Attribution Graphs cs.CL · 2026 · author #3
Jailbroken: How Does LLM Safety Training Fail? cs.LG · 2023 · author #3
Eliciting Latent Predictions from Transformers with the Tuned Lens cs.LG · 2023 · author #8
Progress measures for grokking via mechanistic interpretability cs.LG · 2023 · author #5
Discovering Latent Knowledge in Language Models Without Supervision cs.CL · 2022 · author #4
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small cs.LG · 2022 · author #5
The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models cs.LG · 2022 · author #3
Unsolved Problems in ML Safety cs.LG · 2021 · author #4
Measuring Coding Challenge Competence With APPS cs.SE · 2021 · author #11
Measuring Mathematical Problem Solving With the MATH Dataset cs.LG · 2021 · author #8
Measuring Massive Multitask Language Understanding cs.CY · 2020 · author #7
Aligning AI With Shared Human Values cs.CY · 2020 · author #7
Transfer of Adversarial Robustness Between Perturbation Types cs.LG · 2019 · author #5
FrAngel: Component-Based Synthesis with Control Structures cs.PL · 2018 · author #2
Semidefinite relaxations for certifying robustness to adversarial examples cs.LG · 2018 · author #2
Troubling Trends in Machine Learning Scholarship stat.ML · 2018 · author #2
Sever: A Robust Meta-Algorithm for Stochastic Optimization cs.LG · 2018 · author #5
Better Agnostic Clustering Via Relaxed Tensor Norms cs.LG · 2017 · author #2
Certified Defenses for Data Poisoning Attacks cs.LG · 2017 · author #1
Does robustness imply tractability? A lower bound for planted clique in the semi-random model cs.CC · 2017 · author #1
Resilience: A Criterion for Learning in the Presence of Arbitrary Outliers cs.LG · 2017 · author #1
Learning from Untrusted Data cs.LG · 2016 · author #2
Concrete Problems in AI Safety cs.AI · 2016 · author #3
Avoiding Imposters and Delinquents: Adversarial Crowdsourcing and Peer Prediction cs.HC · 2016 · author #1
Unsupervised Risk Estimation Using Only Conditional Independence Structure cs.LG · 2016 · author #1
Learning Fast-Mixing Models for Structured Prediction cs.LG · 2015 · author #1
Reified Context Models cs.LG · 2015 · author #1
The Statistics of Streaming Sparse Regression math.ST · 2014 · author #1
Permutations with Ascending and Descending Blocks math.CO · 2009 · author #1
Derangements with Ascending and Descending Blocks math.CO · 2009 · author #1
On Coloring the Odd-Distance Graph math.CO · 2009 · author #1
Cayley graphs formed by conjugate generating sets of S_n math.CO · 2007 · author #1

Mentions

2008.02275 #7 · arxiv_oai · confidence 0.70 Jacob Steinhardt
2201.03544 #3 · arxiv_oai · confidence 0.70 Jacob Steinhardt
2109.13916 #4 · arxiv_oai · confidence 0.70 Jacob Steinhardt
0908.4347 #1 · backfill · confidence 0.70 Jacob Steinhardt
0908.3330 #1 · backfill · confidence 0.70 Jacob Steinhardt
0908.1452 #1 · backfill · confidence 0.70 Jacob Steinhardt
2212.03827 #4 · arxiv_oai · confidence 0.70 Jacob Steinhardt
0711.3057 #1 · backfill · confidence 0.70 Jacob Steinhardt

Frequent Coauthors

Percy Liang 7 shared papers
Dan Hendrycks 6 shared papers
Collin Burns 5 shared papers
Dawn Song 4 shared papers
Steven Basart 4 shared papers
Gregory Valiant 3 shared papers
Moses Charikar 3 shared papers
Akul Arora 2 shared papers
Jerry Li 2 shared papers
John Schulman 2 shared papers
Mantas Mazeika 2 shared papers
Saurav Kadavath 2 shared papers
Aditi Raghunathan 1 shared paper
Alexander Pan 1 shared paper
Alexander Wei 1 shared paper
Alexandre Variengien 1 shared paper
Alistair Stewart 1 shared paper
Andrew Critch 1 shared paper
Andy Zou 1 shared paper
Arthur Conmy 1 shared paper