Guttag, and Adrian V

Andrew Hoopes, Neel Dey, Victor Ion Butoi, John V · 2024 · arXiv 2410.08397

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

MedOpenClaw and MedFlowBench: Auditing Medical Agents in Full-Study Workflows

cs.CV · 2026-03-25 · conditional · novelty 8.0

MedFlowBench evaluates VLM agents on full radiology and pathology studies by requiring both task answers and verifiable evidence like key slices and regions of interest, revealing that answer-only scores overestimate performance.

VERITAS: Verifiable Epistemic Reasoning for Image-Derived Hypothesis Testing via Agentic Systems

cs.MA · 2026-04-13 · unverdicted · novelty 7.0

VERITAS is a multi-agent system for verifiable hypothesis testing on multimodal clinical MRI datasets that achieves 81.4% verdict accuracy with frontier models and introduces an epistemic evidence labeling framework.

GAZE: Grounded Agentic Zero-shot Evaluation with Viewer-Level Tools and Literature Retrieval on Rare Brain MRI

cs.LG · 2026-04-25 · unverdicted · novelty 6.0

GAZE framework with viewer tools and literature retrieval achieves 58.2 mAP@0.3 lesion localization and 34.9% top-1 diagnostic accuracy on 906 rare brain MRI cases in zero-shot setting, with larger gains on rarest pathologies.

AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks

cs.AI · 2026-05-11 · unverdicted · novelty 5.0

Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.

citing papers explorer

Showing 4 of 4 citing papers.

MedOpenClaw and MedFlowBench: Auditing Medical Agents in Full-Study Workflows cs.CV · 2026-03-25 · conditional · none · ref 36
MedFlowBench evaluates VLM agents on full radiology and pathology studies by requiring both task answers and verifiable evidence like key slices and regions of interest, revealing that answer-only scores overestimate performance.
VERITAS: Verifiable Epistemic Reasoning for Image-Derived Hypothesis Testing via Agentic Systems cs.MA · 2026-04-13 · unverdicted · none · ref 15
VERITAS is a multi-agent system for verifiable hypothesis testing on multimodal clinical MRI datasets that achieves 81.4% verdict accuracy with frontier models and introduces an epistemic evidence labeling framework.
GAZE: Grounded Agentic Zero-shot Evaluation with Viewer-Level Tools and Literature Retrieval on Rare Brain MRI cs.LG · 2026-04-25 · unverdicted · none · ref 13
GAZE framework with viewer tools and literature retrieval achieves 58.2 mAP@0.3 lesion localization and 34.9% top-1 diagnostic accuracy on 906 rare brain MRI cases in zero-shot setting, with larger gains on rarest pathologies.
AgentRx: A Benchmark Study of LLM Agents for Multimodal Clinical Prediction Tasks cs.AI · 2026-05-11 · unverdicted · none · ref 58
Single-agent LLM frameworks outperform naive multi-agent systems in multimodal clinical risk prediction tasks and are better calibrated.

Guttag, and Adrian V

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer