Aryaman Arora

Identifiers

  • name variant Aryaman Arora 0.60 · backfill

Papers (6)

  1. The Piggyback Hypothesis of Generalization: Explaining and Mitigating Emergent Misalignment · cs.CL · 2026 · author #3
  2. PreFT: Prefill-only finetuning for efficient inference · cs.LG · 2026 · author #2
  3. ADAG: Automatically Describing Attribution Graphs · cs.CL · 2026 · author #1
  4. Verbalizing LLMs' assumptions to explain and control sycophancy · cs.CL · 2026 · author #6
  5. Language Model Circuits Are Sparse in the Neuron Basis · cs.CL · 2026 · author #1
  6. Localizing Model Behavior with Path Patching · cs.LG · 2023 · author #4

Mentions

  • 2601.22594 #1 · arxiv_oai · confidence 0.70 Aryaman Arora
  • 2606.06667 #3 · arxiv_oai · confidence 0.70 Aryaman Arora
  • 2304.05969 #4 · arxiv_oai · confidence 0.70 Aryaman Arora

Frequent Coauthors