The TxBench-PP benchmark shows that leading AI agents achieve at most 59% success on tasks that require recovering preclinical pharmacology conclusions from assay data.
Corsello, David D
3 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: dataset (1)
Citation-polarity summary:
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: dataset (1)
Polarities: use dataset (1)
Representative citing papers:
Drug-blind cancer sensitivity prediction is limited by evaluation metric and training distribution rather than drug representation complexity.
CellScientist introduces a dual-space hierarchical orchestration system that enables closed-loop refinement of virtual cell models by routing execution discrepancies back to hypothesis or implementation updates, yielding improved benchmark performance with auditable traces.