Dongyoon Hahm

Identifiers

  • name variant Dongyoon Hahm 0.60 · backfill

Papers (1)

  1. Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases · cs.AI · 2026 · author #1

Mentions

  • 2605.27355 #1 · arxiv_oai · confidence 0.70 · Dongyoon Hahm

Frequent Coauthors