Hoagy Cunningham

Identifiers

  • name variant Hoagy Cunningham 0.60 · backfill

Papers (4)

  1. Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet · cs.AI · 2026 · author #11
  2. Segment-Level Coherence for Robust Harmful Intent Probing in LLMs · cs.CL · 2026 · author #5
  3. Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming · cs.CL · 2025 · author #17
  4. Sparse Autoencoders Find Highly Interpretable Features in Language Models · cs.LG · 2023 · author #1

Mentions

  • 2605.29358 #11 · arxiv_oai · confidence 0.70 Hoagy Cunningham
  • 2501.18837 #17 · arxiv_oai · confidence 0.70 Hoagy Cunningham

Frequent Coauthors