Alexander Pan
Identifiers
- name variant Alexander Pan 0.60 · backfill
Papers (4)
- Reducing Political Manipulation with Consistency Training cs.CL · 2026 · author #3
- The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning cs.LG · 2024 · author #2
- Representation Engineering: A Top-Down Approach to AI Transparency cs.LG · 2023 · author #7
- The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models cs.LG · 2022 · author #1
Mentions
- 2605.22771 #3 · arxiv_oai · confidence 0.70 Alexander Pan
- 2201.03544 #1 · arxiv_oai · confidence 0.70 Alexander Pan
- 2403.03218 #2 · arxiv_oai · confidence 0.70 Alexander Pan
Frequent Coauthors
- Dan Hendrycks 3 shared papers
- Long Phan 3 shared papers
- Adam Khoja 2 shared papers
- Andy Zou 2 shared papers
- Ann-Kathrin Dombrowski 2 shared papers
- Mantas Mazeika 2 shared papers
- Nathaniel Li 2 shared papers
- Shashwat Goel 2 shared papers
- Steven Basart 2 shared papers
- Zifan Wang 2 shared papers
- Adam A. Hunt 1 shared paper
- Alexandr Wang 1 shared paper
- Alex Levinson 1 shared paper
- Alex Mallen 1 shared paper
- Alice Blair 1 shared paper
- Alice Gatti 1 shared paper
- Andrew B. Liu 1 shared paper
- Anjali Gopal 1 shared paper
- Ariel Herbert-Voss 1 shared paper
- Bhrugu Bharathi 1 shared paper