Josef Dai

Identifiers

  • name variant Josef Dai 0.60 · backfill

Papers (3)

  1. Debate with Images: Detecting Deceptive Behaviors in Multimodal Large Language Models cs.AI · 2025 · author #7
  2. SafeVLA: Towards Safety Alignment of Vision-Language-Action Model via Constrained Learning cs.RO · 2025 · author #6
  3. Safe RLHF: Safe Reinforcement Learning from Human Feedback cs.AI · 2023 · author #1

Mentions

  • 2512.00349 #7 · arxiv_oai · confidence 0.70 Josef Dai

Frequent Coauthors