Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge

2025 · cs.CV · arXiv 2510.04772

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Developing generalizable surgical AI requires multi-institutional data, yet patient privacy constraints preclude direct data sharing, making Federated Learning (FL) a natural candidate solution. The application of FL to complex, spatiotemporal surgical video data remains largely unbenchmarked. We present the FedSurg Challenge, the first international benchmarking initiative dedicated to FL in surgical vision, evaluated as a proof-of-concept on a multi-center laparoscopic appendectomy dataset (preliminary subset of Appendix300). Three submissions were evaluated on generalization to an unseen center and center-specific adaptation. Centralized and Swarm Learning baselines isolate the contributions of task difficulty and decentralization to observed performance. Even with all data pooled centrally, the task achieved only 26.31% F1-score on the unseen center, while decentralized training introduced an additional, separable performance penalty. Temporal modeling emerges as the dominant architectural factor: video-level spatiotemporal models consistently outperformed frame-level approaches regardless of aggregation strategy. Naive local fine-tuning leads to classifier collapse on imbalanced local data; structured personalized FL with parameter-efficient fine-tuning represents a more principled path toward center-specific adaptation. By characterizing current FL limitations through rigorous statistical analysis, this work establishes a methodological reference point for robust, privacy-preserving AI systems in surgical video analysis.

representative citing papers

GEN-Guard: Correcting Generalization Failures for Deployable Federated Surgical AI

cs.CV · 2026-06-18 · unverdicted · novelty 5.0

GEN-Guard detects generalization failures in federated surgical AI via client-blocked evaluation and corrects them via disagreement-aware distillation, yielding reported gains on in-federation, unseen-institution, and worst-case performance.

