Pith Number

pith:JQV66JLA

pith:2024:JQV66JLA35IBZFBTFZ3NRYLFR7

not attested · not anchored · not stored · refs resolved

LAB-Bench: Measuring Capabilities of Language Models for Biology Research

Andrew D. White, Jon M. Laurent, Joseph D. Janizek, Manvitha Ponnapati, Michaela M. Hinks, Michael J. Hammerling, Michael Ruzo, Samuel G. Rodriques, Siddharth Narayanan

LAB-Bench introduces over 2,400 questions to test AI on practical biology research tasks such as literature search and sequence manipulation.

arxiv:2407.10362 v3 · 2024-07-14 · cs.AI


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{JQV66JLA35IBZFBTFZ3NRYLFR7}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
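The order-independence described above can be illustrated with a minimal sketch. The actual Pith merge algorithm is not specified on this page, so everything here is an assumption: the `Event` fields, the sort key, and the "last event of each kind wins" rule are hypothetical, and signature checking is omitted entirely. The sketch only shows why sorting events by a total order before folding them yields the same state on every mirror.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape; the real bundle schema may differ.
    event_id: str   # unique identifier, used as a tie-breaker
    timestamp: str  # ISO 8601 UTC string, so lexical order equals time order
    kind: str       # e.g. "anchored" or "stored" (illustrative names)
    payload: Any


def merge(events: Iterable[Event]) -> Dict[str, Any]:
    """Fold events into a state dict in a fixed total order.

    Sorting by (timestamp, event_id) makes the result independent of
    the order in which a mirror happened to receive the events.
    """
    state: Dict[str, Any] = {}
    seen = set()
    for ev in sorted(events, key=lambda e: (e.timestamp, e.event_id)):
        if ev.event_id in seen:
            continue  # ignore duplicate deliveries of the same event
        seen.add(ev.event_id)
        state[ev.kind] = ev.payload  # assumed rule: later event wins
    return state


events = [
    Event("e1", "2026-05-17T23:38:47Z", "stored", False),
    Event("e2", "2026-05-18T00:00:00Z", "stored", True),
    Event("e3", "2026-05-18T00:00:00Z", "anchored", True),
]

# Any arrival order, with or without duplicates, gives the same state.
assert merge(events) == merge(list(reversed(events)))
assert merge(events + events) == merge(events)
```

A real mirror would also verify each event's signature before merging; that step is left out here because the event format is not shown on this page.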

Claims

C1 · strongest claim

An AI system that can achieve consistently high scores on the more difficult LAB-Bench tasks would serve as a useful assistant for researchers in areas such as literature search and molecular cloning.

C2 · weakest assumption

The multiple-choice questions in LAB-Bench accurately reflect the practical capabilities required for real-world biology research tasks, rather than testing only surface-level pattern matching.

C3 · one-line summary

LAB-Bench provides over 2,400 multiple-choice questions to measure LLM performance on real biology research tasks like literature recall, figure reading, database access, and sequence manipulation, with initial results compared against human expert biologists.

References

59 extracted · 59 resolved · 3 Pith anchors

[1] Joanna S Amberger, Carol A Bocchini, François Schiettecatte, Alan F Scott, and Ada Hamosh. OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorder 2015

[2] Introducing the next generation of Claude, March 2024 2024

[3] Introducing the next generation of Claude, March 2024 2024

[4] Lessons from the Trenches on Reproducible Evaluation of Language Models 2024 · arXiv:2405.14782

[5] Autonomous chemical research with large language models 2023 · doi:10.1038/s41586-023-06792-0

Formal links

3 machine-checked theorem links

Cited by

26 papers in Pith

FML-bench: A Controlled Study of AI Research Agent Strategies from the Perspective of Search Dynamics

AI for Auto-Research: Roadmap & User Guide

BioXArena: Benchmarking LLM Agents on Multi-Modal Biomedical Machine Learning Tasks

LEAP: Trajectory-Level Evaluation of LLMs in Iterative Scientific Design

Benchmarking Misuse Mitigation Against Covert Adversaries

Receipt and verification

First computed	2026-05-17T23:38:47.379162Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

4c2bef2560df501c94332e76d8e1658fe77d63b751508f4118a5d4e623f63c80

Aliases

arxiv: 2407.10362 · arxiv_version: 2407.10362v3 · doi: 10.48550/arxiv.2407.10362 · pith_short_12: JQV66JLA35IB · pith_short_16: JQV66JLA35IBZFBT · pith_short_8: JQV66JLA
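The identifiers on this page appear to be derived from the canonical hash: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest, and the `pith_short_*` aliases are shorter prefixes of the same string. This is an observation from the values shown here, not a documented specification, and the helper name `pith_id` is hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "4c2bef2560df501c94332e76d8e1658fe77d63b751508f4118a5d4e623f63c80"


def pith_id(hex_digest: str, length: int = 26) -> str:
    """Return the first `length` Base32 characters of a hex digest.

    Assumption: Pith identifiers are Base32 prefixes of the canonical
    SHA-256 hash. This holds for the record on this page.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


print(pith_id(CANONICAL_HASH))      # full Pith Number
print(pith_id(CANONICAL_HASH, 16))  # pith_short_16
print(pith_id(CANONICAL_HASH, 12))  # pith_short_12
print(pith_id(CANONICAL_HASH, 8))   # pith_short_8
```

Under this assumption, a resolver can treat a short alias as a prefix query against the full identifier.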

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/JQV66JLA35IBZFBTFZ3NRYLFR7 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4c2bef2560df501c94332e76d8e1658fe77d63b751508f4118a5d4e623f63c80

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "93987a7bf6ec82cff30bd36782bcb0930d5cc6ddbca93afde4947e0547ac096e",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2024-07-14T23:52:25Z",
    "title_canon_sha256": "e1e688186ac8a564ee4148b596d36e6270602308f20fb0f5c063dad5750372a3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2407.10362",
    "kind": "arxiv",
    "version": 3
  }
}
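The hashing step of the shell one-liner above can also be run offline against the record as printed here. The sketch below uses the same canonicalization (sorted keys, compact separators, UTF-8, no ASCII escaping); the function name `canonical_sha256` is an assumption for illustration. If the page is consistent, the printed digest should match the canonical hash listed above.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the canonical JSON form from the one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "93987a7bf6ec82cff30bd36782bcb0930d5cc6ddbca93afde4947e0547ac096e",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by-sa/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2024-07-14T23:52:25Z",
        "title_canon_sha256": "e1e688186ac8a564ee4148b596d36e6270602308f20fb0f5c063dad5750372a3",
    },
    "schema_version": "1.0",
    "source": {"id": "2407.10362", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, reordering the fields of the record does not change the digest; changing any value does.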