LAB-Bench provides over 2,400 multiple-choice questions to measure LLM performance on real biology research tasks like literature recall, figure reading, database access, and sequence manipulation, with initial results compared against human expert biologists.
BioASQ-QA: A manually curated corpus for biomedical question answering
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2024 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
LAB-Bench: Measuring Capabilities of Language Models for Biology Research
LAB-Bench provides over 2,400 multiple-choice questions to measure LLM performance on real biology research tasks like literature recall, figure reading, database access, and sequence manipulation, with initial results compared against human expert biologists.