AutoMedBench evaluates AI agents on long-horizon medical workflows across five stages and finds validation and submission as dominant failure points based on thousands of runs.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
A new quality-guided approach for semi-supervised medical image segmentation that trains a predictor on synthetic errors to enhance pseudolabel handling.
citing papers explorer
-
AutoMedBench: Towards Medical AutoResearch with Agentic AI Models
AutoMedBench evaluates AI agents on long-horizon medical workflows across five stages and finds validation and submission as dominant failure points based on thousands of runs.
-
Quality-Guided Semi-Supervised Learning for Medical Image Segmentation
A new quality-guided approach for semi-supervised medical image segmentation that trains a predictor on synthetic errors to enhance pseudolabel handling.