The SAGES Critical View of Safety Challenge: A Global Benchmark for AI-Assisted Surgical Quality Assessment
2 Pith papers cite this work.
Citation roles: background (1). Citation polarities: background (1).
citing papers explorer
- Sum-of-Checks: Structured Reasoning for Surgical Safety with Large Vision-Language Models
  Sum-of-Checks decomposes Critical View of Safety (CVS) criteria into expert checks that large vision-language models (LVLMs) evaluate and that are combined by weighted aggregation, yielding 12-14% higher frame-level mAP on the Endoscapes2023 benchmark.
- Current validation practice undermines surgical AI development
  A multi-stage Delphi consensus with 92 experts catalogs widespread validation pitfalls in surgical AI video analysis across data, metrics, and reporting, supported by a systematic review and empirical experiments.