Fabien Roger

Identifiers

  • name variant Fabien Roger 0.60 · backfill

Papers (7)

  1. SLEIGHT-Bench: A Benchmark of Evasion Attacks Against Agent Monitors · cs.CR · 2026 · author #4
  2. Classifier Context Rot: Monitor Performance Degrades with Context Length · cs.AI · 2026 · author #2
  3. How Useful Is Cross-Domain Generalization for Training LLM Monitors? · cs.AI · 2026 · author #2
  4. Narrow Secret Loyalty Dodges Black-Box Audits · cs.CR · 2026 · author #2
  5. Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety · cs.AI · 2025 · author #32
  6. Reasoning Models Don't Always Say What They Think · cs.CL · 2025 · author #10
  7. Alignment faking in large language models · cs.AI · 2024 · author #4

Mentions

  • 2507.11473 #32 · arxiv_oai · confidence 0.70 Fabien Roger
  • 2605.16626 #4 · arxiv_oai · confidence 0.70 Fabien Roger

Frequent Coauthors