Whose Name Comes Up? II: Benchmarking and Intervention-Based Auditing of LLM-Based Scholar Recommendation

· 2026 · cs.IR · arXiv 2602.08873

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

Large language models (LLMs) are now used for academic expert recommendation. Existing audits typically evaluate such recommendations in isolation, ignoring end-user inference-time interventions. Thus, it remains unclear whether failures (e.g., refusals, hallucinations, uneven coverage) stem from model choice or deployment decisions. We introduce LLMScholarBench, a benchmark for auditing LLM-based scholar recommendation that jointly evaluates model infrastructure and end-user interventions across multiple tasks. LLMScholarBench measures technical quality and social representation using nine metrics. We instantiate the benchmark in physics expert recommendation and audit 22 LLMs under temperature variation, representation-constrained prompting, and retrieval-augmented generation (RAG) via web search. Our results show that each intervention entails distinct trade-offs. Higher temperature degrades validity, consistency, and factuality. Representation-constrained prompting improves diversity at the expense of factuality, whereas RAG primarily improves technical quality but reduces diversity and parity. Overall, end-user interventions reshape trade-offs rather than providing uniform gains. LLMScholarBench makes all these dynamics auditable across models and interventions in LLM-based scholar recommendation.

representative citing papers

Whose Name Comes Up? III: Persona Prompting Effects in LLM-Based Scholar Recommendation

cs.IR · 2026-05-27 · unverdicted · novelty 6.0

Audits of 43 LLMs show that varying persona prompts (language, location, role-and-task) and context affects technical quality and social representativeness of scholar recommendations, with location impacting diversity and factuality.

