Rethlefsen, Shona Kirtley, Siw Waffenschmidt, et al.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Citing papers
- Benchmarking LLM Agents on Meta-Analysis Articles from Nature Portfolio
  MetaSyn benchmark shows LLM pipelines recover at most 52.7% of ground-truth included studies due to screening failures on PI/ECO eligibility, despite 90.9% retrieval recall at K=200.
- Decision-Theoretic Stopping Rules for Document Screening
  Derives three EVPI-based stopping policies for document screening and shows higher net utility than recall-target methods on CLEF-IP and medical review datasets.