Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Developing generalizable surgical AI requires multi-institutional data, yet patient privacy constraints preclude direct data sharing, making Federated Learning (FL) a natural candidate solution. The application of FL to complex, spatiotemporal surgical video data remains largely unbenchmarked. We present the FedSurg Challenge, the first international benchmarking initiative dedicated to FL in surgical vision, evaluated as a proof-of-concept on a multi-center laparoscopic appendectomy dataset (preliminary subset of Appendix300). Three submissions were evaluated on generalization to an unseen center and center-specific adaptation. Centralized and Swarm Learning baselines isolate the contributions of task difficulty and decentralization to observed performance. Even with all data pooled centrally, the F1-score on the unseen center reached only 26.31%, while decentralized training introduced an additional, separable performance penalty. Temporal modeling emerges as the dominant architectural factor: video-level spatiotemporal models consistently outperformed frame-level approaches regardless of aggregation strategy. Naive local fine-tuning leads to classifier collapse on imbalanced local data; structured personalized FL with parameter-efficient fine-tuning represents a more principled path toward center-specific adaptation. By characterizing current FL limitations through rigorous statistical analysis, this work establishes a methodological reference point for robust, privacy-preserving AI systems in surgical video analysis.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
GEN-Guard: Correcting Generalization Failures for Deployable Federated Surgical AI
GEN-Guard detects generalization failures in federated surgical AI via client-blocked evaluation and corrects them via disagreement-aware distillation, yielding reported gains on in-federation, unseen-institution, and worst-case performance.