Kaan Ozkara

Identifiers

  • name variant Kaan Ozkara 0.60 · backfill

Papers (1)

  1. Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models · cs.LG · 2026 · author #3

Mentions

  • 2605.25189 #3 · arxiv_oai · confidence 0.70 Kaan Ozkara

Frequent Coauthors