In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 513–523, Vienna, Austria

Ai2 Scholar QA: Organized literature synthesis with attribution

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

cs.CL · 2026-03-17 · conditional · novelty 7.0

Personalized deep research systems need evaluation with real users because LLM judges overlook nuanced errors that matter to researchers.

Language Models Don't Know What You Want: Evaluating Personalization in Deep Research Needs Real Users
cs.CL · 2026-03-17 · conditional · none · ref 4