DRACULA is the first dataset of user feedback on intermediate actions for deep research agents, showing that LLMs predict preferred actions better with full user history and that history-based action generation leads to higher user selection rates.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CL 2years
2026 2representative citing papers
Personalized deep research systems need evaluation with real users because LLM judges overlook nuanced errors that matter to researchers.
citing papers explorer
-
DRACULA: Hunting for the Actions Users Want Deep Research Agents to Execute
DRACULA is the first dataset of user feedback on intermediate actions for deep research agents, showing that LLMs predict preferred actions better with full user history and that history-based action generation leads to higher user selection rates.
-
Language Models Don't Know What You Want: Evaluating Personalization in Deep Research Needs Real Users
Personalized deep research systems need evaluation with real users because LLM judges overlook nuanced errors that matter to researchers.