William Saunders

Identifiers

  • name variant William Saunders 0.60 · backfill

Papers (8)

  1. Emotion Concepts and their Function in a Large Language Model · cs.AI · 2026 · author #3
  2. Open Problems in Mechanistic Interpretability · cs.LG · 2025 · author #22
  3. Self-critiquing models for assisting human evaluators · cs.CL · 2022 · author #1
  4. Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models · cs.CL · 2022 · author #425
  5. WebGPT: Browser-assisted question-answering with human feedback · cs.CL · 2021 · author #10
  6. Evaluating Large Language Models Trained on Code · cs.LG · 2021 · author #41
  7. Trial without Error: Towards Safe Reinforcement Learning via Human Intervention · cs.AI · 2017 · author #1
  8. The 6dF Galaxy Survey · astro-ph · 2003 · author #5

Mentions

  • 2206.05802 #1 · arxiv_oai · confidence 0.70 · William Saunders

Frequent Coauthors