Asaf Yehudai

Identifiers

  • name variant Asaf Yehudai 0.60 · backfill

Papers (11)

  1. Every Eval Ever: A Unifying Schema and Community Repository for AI Evaluation Results · cs.AI · 2026 · author #10
  2. Evaluation Cards: An Interpretive Layer for AI Evaluation Reporting · cs.AI · 2026 · author #38
  3. Teaching Values to Machines: Simulating Human-Like Behavior in LLMs · cs.AI · 2026 · author #1
  4. A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks · cs.AI · 2026 · author #3
  5. Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents · cs.CL · 2026 · author #1
  6. Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents? · cs.CL · 2026 · author #4
  7. Growing Pains: Extensible and Efficient LLM Benchmarking Via Fixed Parameter Calibration · cs.CL · 2026 · author #3
  8. General Agent Evaluation · cs.AI · 2026 · author #2
  9. Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization · cs.CL · 2025 · author #2
  10. Survey on Evaluation of LLM-based Agents · cs.AI · 2025 · author #1
  11. WildIFEval: Instruction Following in the Wild · cs.CL · 2025 · author #2

Mentions

  • 2606.14516 #10 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2503.06573 #2 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2606.09809 #38 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2605.30036 #1 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2605.28556 #3 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2605.22608 #1 · arxiv_oai · confidence 0.70 · Asaf Yehudai
  • 2605.19196 #4 · arxiv_oai · confidence 0.70 · Asaf Yehudai

Frequent Coauthors