Long Phan

Identifiers

  • name variant Long Phan 0.60 · backfill

Papers (6)

  1. Reducing Political Manipulation with Consistency Training · cs.CL · 2026 · author #1
  2. Humanity's Last Exam · cs.LG · 2025 · author #1
  3. The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning · cs.LG · 2024 · author #10
  4. HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal · cs.LG · 2024 · author #2
  5. Representation Engineering: A Top-Down Approach to AI Transparency · cs.LG · 2023 · author #2
  6. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model · cs.CL · 2022 · author #78

Mentions

  • 2605.22771 #1 · arxiv_oai · confidence 0.70 Long Phan
  • 2403.03218 #10 · arxiv_oai · confidence 0.70 Long Phan

Frequent Coauthors