Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge
Pith reviewed 2026-05-18 10:18 UTC · model grok-4.3
The pith
Even with all surgical video data pooled centrally, appendicitis classification reaches only 26.31 percent F1 on an unseen center. Decentralized training adds a further, separable penalty, and video-level models outperform frame-level ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the FedSurg challenge on a preliminary subset of the Appendix300 dataset, centralized training achieved only 26.31 percent F1-score when tested on videos from an unseen center. Federated and swarm-learning submissions incurred an additional, measurable performance drop beyond that central baseline. Spatiotemporal models operating on full video clips outperformed frame-by-frame approaches under every aggregation method tested. Naive local fine-tuning on imbalanced per-center data produced classifier collapse, whereas structured personalized federated learning combined with parameter-efficient fine-tuning provided a clearer path for center-specific adaptation.
What carries the argument
The unseen-center generalization split used to separate inherent task difficulty from the effects of data decentralization across centralized, federated, and swarm-learning submissions.
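The split described above can be sketched as a leave-one-center-out partition. This is a minimal illustration, not the challenge's actual tooling: the function name, the center labels, and the video ids are all hypothetical, and the page does not say how many centers the preliminary subset contains.

```python
from collections import defaultdict

def leave_one_center_out(videos, held_out_center):
    """Split (video_id, center) pairs into per-center training pools
    and a test set drawn only from the held-out (unseen) center."""
    train_by_center = defaultdict(list)
    unseen_test = []
    for video_id, center in videos:
        if center == held_out_center:
            unseen_test.append(video_id)
        else:
            train_by_center[center].append(video_id)
    return dict(train_by_center), unseen_test

# Hypothetical toy data: center names and video ids are made up.
videos = [("v1", "A"), ("v2", "A"), ("v3", "B"), ("v4", "C"), ("v5", "C")]
train_by_center, unseen_test = leave_one_center_out(videos, held_out_center="C")

# A centralized baseline pools every training center; a federated or swarm
# run would instead keep each entry of train_by_center on its own node.
centralized_pool = [v for ids in train_by_center.values() for v in ids]
```

Because both training regimes are scored on the same `unseen_test` set, any gap between them can be attributed to decentralization rather than to a different test distribution.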
If this is right
- Centralized pooling of multi-center surgical videos still yields only 26.31 percent F1 on unseen centers, showing the task remains hard even without privacy constraints.
- Decentralized training adds a distinct performance penalty separate from the difficulty of the underlying classification problem.
- Video-level spatiotemporal models outperform frame-level models under both centralized and decentralized training.
- Naive local fine-tuning collapses on imbalanced center-specific data.
- Structured personalized federated learning with parameter-efficient fine-tuning offers a more reliable route to center adaptation.
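To make the classifier-collapse point concrete, the sketch below scores a model that always predicts the majority class. The four-class label set and the 7/1/1/1 class distribution are hypothetical, and macro-averaging is an assumption: the page reports an F1-score without stating the averaging scheme.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1; a class with no true or
    predicted samples contributes 0.0."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

labels = [0, 1, 2, 3]            # hypothetical class ids
y_true = [0] * 7 + [1, 2, 3]     # hypothetical imbalanced local data
collapsed = [0] * len(y_true)    # collapsed classifier: always the majority class

accuracy = sum(t == p for t, p in zip(y_true, collapsed)) / len(y_true)
macro = macro_f1(y_true, collapsed, labels)
```

In this toy case accuracy is 0.7 while macro-F1 is roughly 0.21, which illustrates why a collapsed model can look acceptable on accuracy yet score poorly on a class-balanced metric.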
Where Pith is reading between the lines
- Raising the performance ceiling may first require stronger base architectures for temporal surgical data before federation techniques are refined further.
- The reported gaps suggest that larger or more balanced multi-center collections could be needed before such systems reach clinical viability.
- The same unseen-center protocol could be applied to other laparoscopic procedures to check whether temporal modeling remains the dominant factor.
Load-bearing premise
The preliminary subset of the Appendix300 dataset together with the three-submission challenge format and the chosen unseen-center split sufficiently capture the statistical and logistical difficulties of real-world multi-institutional surgical video data.
What would settle it
Re-running the same evaluation protocol on the full Appendix300 dataset, or with additional submissions, would directly test the reported limitations. The two outcomes to watch are whether F1 on the unseen center rises well above 26.31 percent and whether decentralized methods match centralized performance.
Original abstract
Developing generalizable surgical AI requires multi-institutional data, yet patient privacy constraints preclude direct data sharing, making Federated Learning (FL) a natural candidate solution. The application of FL to complex, spatiotemporal surgical video data remains largely unbenchmarked. We present the FedSurg Challenge, the first international benchmarking initiative dedicated to FL in surgical vision, evaluated as a proof-of-concept on a multi-center laparoscopic appendectomy dataset (preliminary subset of Appendix300). Three submissions were evaluated on generalization to an unseen center and center-specific adaptation. Centralized and Swarm Learning baselines isolate the contributions of task difficulty and decentralization to observed performance. Even with all data pooled centrally, the task achieved only 26.31% F1-score on the unseen center, while decentralized training introduced an additional, separable performance penalty. Temporal modeling emerges as the dominant architectural factor: video-level spatiotemporal models consistently outperformed frame-level approaches regardless of aggregation strategy. Naive local fine-tuning leads to classifier collapse on imbalanced local data; structured personalized FL with parameter-efficient fine-tuning represents a more principled path toward center-specific adaptation. By characterizing current FL limitations through rigorous statistical analysis, this work establishes a methodological reference point for robust, privacy-preserving AI systems in surgical video analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the FedSurg EndoVis 2024 Challenge as the first benchmarking effort for federated learning in surgical vision, using a preliminary subset of the multi-center Appendix300 laparoscopic appendectomy dataset. Three submissions are evaluated for generalization to an unseen center and center-specific adaptation, with comparisons to centralized and swarm learning baselines. Key findings include a low centralized F1-score of 26.31% on the unseen center, an additional performance penalty from decentralization, the superiority of video-level spatiotemporal models over frame-level approaches irrespective of aggregation, and the risks of naive local fine-tuning leading to classifier collapse on imbalanced data.
Significance. If the empirical comparisons hold under more extensive validation, this establishes a useful reference point for privacy-preserving surgical AI by quantifying the inherent difficulty of the appendicitis classification task even in the centralized case and isolating the additional impact of decentralization. The observation that temporal modeling dominates across strategies and the contrast between naive fine-tuning and structured personalized approaches with parameter-efficient fine-tuning provide concrete directions for future work.
major comments (2)
- [Results / Abstract] The assertion that video-level spatiotemporal models 'consistently outperformed' frame-level approaches 'regardless of aggregation strategy' rests on results from only three submissions on a single unseen-center split of the preliminary Appendix300 subset; this sample size is too small to support a general architectural conclusion without additional submissions, cross-validation, or statistical tests for significance of the observed ordering.
- [Abstract / Methods] The claim of a 'separable' performance penalty from decentralized training beyond the centralized 26.31% F1 baseline lacks supporting details on whether the centralized and swarm baselines used identical architectures, hyperparameters, and data preprocessing as the submissions; without this, the isolation of decentralization effects from task difficulty cannot be verified.
minor comments (2)
- [Results] Include explicit statistical tests (e.g., paired t-tests or bootstrap confidence intervals) for all F1 comparisons and report the exact number of videos/frames in the preliminary subset and unseen center.
- [Methods] Clarify the precise definitions of 'video-level spatiotemporal models' versus 'frame-level approaches' and list the three submissions' architectures in a table for reproducibility.
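The bootstrap confidence intervals requested in the first minor comment could be computed along the following lines. This is a sketch under stated assumptions, not the paper's analysis: it resamples videos with replacement, uses macro-averaged F1 (the page does not state the averaging scheme), and runs on made-up labels and predictions for two hypothetical models.

```python
import random

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over a fixed label set."""
    total = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        total += 2 * tp / denom if denom else 0.0
    return total / len(labels)

def paired_bootstrap_ci(y_true, pred_a, pred_b, labels,
                        n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for F1(model A) - F1(model B), resampling the same
    test videos for both models so the comparison stays paired."""
    rng = random.Random(seed)
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(macro_f1(t, a, labels) - macro_f1(t, b, labels))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical toy labels and predictions for two models.
labels = [0, 1, 2]
y_true = [0, 0, 0, 1, 1, 2, 2, 0, 1, 2]
pred_a = [0, 0, 1, 1, 1, 2, 0, 0, 1, 2]
pred_b = [0, 0, 0, 0, 0, 0, 2, 0, 0, 0]
ci_low, ci_high = paired_bootstrap_ci(y_true, pred_a, pred_b, labels)
```

An interval that excludes zero would suggest the ordering of the two models is not an artifact of which test videos happened to be sampled. With a small unseen-center test set the interval is likely to be wide.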
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review of our manuscript on the FedSurg EndoVis 2024 Challenge. We address each major comment below in a point-by-point manner and indicate the revisions made to strengthen the paper.
-
Referee: The assertion that video-level spatiotemporal models 'consistently outperformed' frame-level approaches 'regardless of aggregation strategy' rests on results from only three submissions on a single unseen-center split of the preliminary Appendix300 subset; this sample size is too small to support a general architectural conclusion without additional submissions, cross-validation, or statistical tests for significance of the observed ordering.
Authors: We agree that the small number of submissions limits the generalizability of this observation. The challenge received only three valid submissions, and all results are reported on a single fixed unseen-center split of the preliminary dataset. In the revised manuscript we have softened the language in the abstract and results to state that spatiotemporal models outperformed frame-level approaches among the submitted methods, rather than claiming a general architectural principle. We have added an explicit limitations section noting the preliminary nature of the finding, the absence of cross-validation or significance testing due to the challenge format, and the need for future challenges with larger numbers of participants to confirm the pattern. Raw per-submission scores are already provided so readers can evaluate consistency directly. revision: yes
-
Referee: The claim of a 'separable' performance penalty from decentralized training beyond the centralized 26.31% F1 baseline lacks supporting details on whether the centralized and swarm baselines used identical architectures, hyperparameters, and data preprocessing as the submissions; without this, the isolation of decentralization effects from task difficulty cannot be verified.
Authors: We have revised the Methods section to provide the requested details. The centralized and swarm baselines were run on the identical data splits, preprocessing pipeline (including video sampling, normalization, and augmentation), and evaluation protocol as the federated submissions. Where architectures overlapped with submitted methods we reused the same backbone and hyperparameters; otherwise we selected representative models matched as closely as possible to the challenge task. A new supplementary table now lists the exact configuration for each baseline to make the isolation of decentralization effects transparent. revision: yes
Circularity Check
No circularity: empirical challenge results are self-contained
The paper reports direct empirical F1-scores and performance comparisons from three challenge submissions plus centralized (26.31% on unseen center) and Swarm baselines on a preliminary Appendix300 subset. Claims about temporal-model dominance and separable decentralization penalty rest on these held-out evaluations and standard FL aggregation, with no equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations that reduce the reported outcomes to inputs by construction. The experimental design isolates task difficulty from decentralization without circular reduction.
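For readers unfamiliar with what "standard FL aggregation" typically means, the sketch below shows sample-size-weighted parameter averaging in the style of FedAvg. It is an illustration only: the page does not state which aggregation rule each submission used, and the parameter names, values, and client sizes here are hypothetical.

```python
def fedavg(client_weights, client_sizes):
    """Average each parameter across clients, weighting every client
    by its number of local training samples."""
    total = sum(client_sizes)
    first = client_weights[0]
    return {
        name: [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }

# Hypothetical toy round: two centers, one flat parameter vector each.
center_a = {"w": [1.0, 2.0]}   # trained on 1 local sample
center_b = {"w": [3.0, 4.0]}   # trained on 3 local samples
global_model = fedavg([center_a, center_b], client_sizes=[1, 3])
```

Here the larger center pulls the average toward its own parameters (`[2.5, 3.5]` rather than the unweighted `[2.0, 3.0]`), which is one reason imbalanced centers can dominate a shared model.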
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Standard definitions of precision, recall, and F1-score for multi-class classification.
- domain assumption: The preliminary subset of Appendix300 is representative of multi-center laparoscopic appendectomy video distributions.
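For reference, the standard definitions named in the first axiom can be written per class c, with TP_c, FP_c, and FN_c denoting true positives, false positives, and false negatives. The macro-averaged form in the second line is an assumption, since the page does not state how per-class scores were combined.

```latex
\[
P_c = \frac{TP_c}{TP_c + FP_c}, \qquad
R_c = \frac{TP_c}{TP_c + FN_c}, \qquad
F1_c = \frac{2\,P_c\,R_c}{P_c + R_c} = \frac{2\,TP_c}{2\,TP_c + FP_c + FN_c}
\]
\[
\text{macro-}F1 = \frac{1}{C} \sum_{c=1}^{C} F1_c
\]
```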
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
Temporal modeling emerges as the dominant architectural factor: video-level spatiotemporal models consistently outperformed frame-level approaches regardless of aggregation strategy.
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
Centralized and Swarm Learning baselines isolate the contributions of task difficulty and decentralization to observed performance.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.