Mrinank Sharma

Identifiers

  • name variant: Mrinank Sharma · 0.60 · backfill

Papers (4)

  1. Chain-of-Thought Hijacking · cs.AI · 2025 · author #4
  2. Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming · cs.CL · 2025 · author #1
  3. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training · cs.CR · 2024 · author #22
  4. Towards Understanding Sycophancy in Language Models · cs.CL · 2023 · author #1

Mentions

  • 2510.26418 #4 · arxiv_oai · confidence 0.70 · Mrinank Sharma
  • 2501.18837 #1 · arxiv_oai · confidence 0.70 · Mrinank Sharma

Frequent Coauthors