pith. machine review for the scientific record.

Alexandra Souly

Identifiers

  • name variant Alexandra Souly 0.60 · backfill

Papers (4)

  1. Evaluating whether AI models would sabotage AI safety research · cs.AI · 2026 · author #2
  2. Seven simple steps for log analysis in AI systems · cs.AI · 2026 · author #5
  3. AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents · cs.LG · 2024 · author #2
  4. A StrongREJECT for Empty Jailbreaks · cs.LG · 2024 · author #1

Mentions

  • 2402.10260 #1 · arxiv_oai · confidence 0.70 Alexandra Souly

Frequent Coauthors