Michael Sellitto
Identifiers
No identifiers captured yet.
Papers (3)
- Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training · cs.CR · 2024 · author #21
- Discovering Language Model Behaviors with Model-Written Evaluations · cs.CL · 2022 · author #35
- Constitutional AI: Harmlessness from AI Feedback · cs.CL · 2022 · author #27
Mentions
No mention provenance yet.
Frequent Coauthors
- Amanda Askell · 3 shared papers
- Deep Ganguli · 3 shared papers
- Ethan Perez · 3 shared papers
- Jared Kaplan · 3 shared papers
- Kamal Ndousse · 3 shared papers
- Nicholas Schiefer · 3 shared papers
- Nova DasSarma · 3 shared papers
- Samuel R. Bowman · 3 shared papers
- Shauna Kravec · 3 shared papers
- Tamera Lanham · 3 shared papers
- Yuntao Bai · 3 shared papers
- Andy Jones · 2 shared papers
- Anna Chen · 2 shared papers
- Ben Mann · 2 shared papers
- Cameron McKinnon · 2 shared papers
- Catherine Olsson · 2 shared papers
- Christopher Olah · 2 shared papers
- Danny Hernandez · 2 shared papers
- Dario Amodei · 2 shared papers
- Dawn Drain · 2 shared papers