pith. machine review for the scientific record.

arxiv: 2604.03320 · v1 · submitted 2026-04-02 · 💻 cs.CV

Recognition: no theorem link

Robust Multi-Source Covid-19 Detection in CT Images

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 21:53 UTC · model grok-4.3

classification 💻 cs.CV
keywords COVID-19 detection · CT imaging · multi-center data · multi-task learning · EfficientNet · domain robustness · medical imaging · auxiliary task

The pith

Training a COVID-19 CT classifier to also predict scan source improves results on data from multiple centers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Models for detecting COVID-19 in chest CT scans work well within one hospital but drop in performance when scans come from different centers that use different scanners and protocols. The paper addresses this by training one network on two tasks simultaneously: diagnosing COVID-19 and identifying which center supplied the scan. The shared feature extractor is forced to produce representations that remain useful across centers, and a balanced loss prevents the model from ignoring smaller data sources. On a validation set of 308 scans drawn from four centers, the approach reaches an F1 score of 0.9098 and an AUC-ROC of 0.9647.

Core claim

Jointly training an EfficientNet-B7 backbone to classify both COVID-19 status and data source, with logit-adjusted cross-entropy applied to the source head, produces features that generalize across four centers and yields an F1 score of 0.9098 together with an AUC-ROC of 0.9647 on a 308-scan validation set.

What carries the argument

Multi-task learning on a shared EfficientNet-B7 backbone where the auxiliary head predicts data source and uses logit-adjusted cross-entropy loss to handle uneven source sizes.
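The shared-backbone, two-head design can be sketched as follows. This is an editorial illustration, not the authors' code: the tiny convolutional backbone, layer sizes, and the `gamma` default are stand-ins (the paper uses an EfficientNet-B7 backbone; γ is the source-loss weight swept in Figure 5):

```python
import torch
import torch.nn as nn

class MultiTaskCovidNet(nn.Module):
    """Sketch of a shared backbone feeding a COVID head and a source head.

    The real model uses EfficientNet-B7; the small conv stack below is a
    placeholder so the example stays self-contained.
    """
    def __init__(self, n_sources: int = 4, feat_dim: int = 32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, feat_dim), nn.ReLU(),
        )
        self.covid_head = nn.Linear(feat_dim, 1)           # binary COVID logit
        self.source_head = nn.Linear(feat_dim, n_sources)  # which center

    def forward(self, x):
        feats = self.backbone(x)  # shared representation used by both heads
        return self.covid_head(feats), self.source_head(feats)

def joint_loss(covid_logit, covid_y, source_logits, source_y, log_prior, gamma=0.5):
    """Diagnosis loss plus a weighted, logit-adjusted auxiliary source loss."""
    bce = nn.functional.binary_cross_entropy_with_logits(covid_logit.squeeze(1), covid_y)
    # Logit adjustment: shift source logits by the log class prior before CE,
    # so underrepresented centers are not overlooked.
    ce = nn.functional.cross_entropy(source_logits + log_prior, source_y)
    return bce + gamma * ce
```

Here `gamma` plays the role of the source-loss weight whose sensitivity the paper reports in Figure 5.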

If this is right

  • The model becomes less biased toward the centers that contributed the most training scans.
  • Robustness improves without separate domain-adaptation modules or explicit alignment losses.
  • Preprocessing that selects eight representative slices per scan remains compatible with the multi-task objective.
  • Performance holds across centers that differ in scanners, protocols, and patient populations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar auxiliary-source training could transfer to other multi-hospital medical imaging problems where scanner variation is the main obstacle.
  • If the source-prediction head remains accurate while COVID performance rises, this would indicate successful separation of diagnosis-relevant features from center-specific noise.
  • Applying the same joint objective to new modalities such as chest X-rays from multiple sites would test whether the benefit is specific to CT or more general.

Load-bearing premise

That requiring the model to predict which center produced each scan will remove rather than reinforce source-specific patterns in the learned features.

What would settle it

Evaluate the model on scans from a fifth center never seen during training; a large drop relative to a single-task baseline would show that source prediction did not produce invariant features.
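The proposed test amounts to a leave-one-center-out protocol; a minimal sketch of the split logic, with center IDs and the helper name being hypothetical:

```python
import numpy as np

def leave_one_center_out(centers, n_centers=5):
    """Yield (held_out_center, train_idx, test_idx), holding out one center at a time.

    Training on the remaining centers and evaluating on the held-out one
    measures whether the learned features transfer to an unseen source.
    """
    centers = np.asarray(centers)
    for held_out in range(n_centers):
        test = np.flatnonzero(centers == held_out)
        train = np.flatnonzero(centers != held_out)
        yield held_out, train, test
```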

Figures

Figures reproduced from arXiv: 2604.03320 by Aryana Hou, Asmita Yuki Pritha, Daniel Ding, Jason Xu, Justin Li, Shu Hu, Xin Wang.

Figure 1. Representative chest CT slices from two different scans.
Figure 2. Overview of the proposed pipeline. Raw CT scans from COVID and non-COVID cases undergo lung extraction and KDS.
Figure 3. Data preprocessing pipeline. Raw CT scans for both COVID-19 and non-COVID cases undergo lung extraction and KDS.
Figure 4. Per-source average F1 at each method's best.
Figure 5. Sensitivity to the source-loss weight γ. Final score of Multi-task + LA as γ varies in {0.1, 0.2, 0.5, 1.0}. The dashed line marks the BCE-only baseline (γ = 0). Performance peaks at γ = 0.5 (score = 0.8194), where the source head and logit adjustment reinforce each other; both lower and higher weights degrade the score.
Original abstract

Deep learning models for COVID-19 detection from chest CT scans generally perform well when the training and test data originate from the same institution, but they often struggle when scans are drawn from multiple centres with differing scanners, imaging protocols, and patient populations. One key reason is that existing methods treat COVID-19 classification as the sole training objective, without accounting for the data source of each scan. As a result, the learned representations tend to be biased toward centres that contribute more training data. To address this, we propose a multi-task learning approach in which the model is trained to predict both the COVID-19 diagnosis and the originating data centre. The two tasks share an EfficientNet-B7 backbone, which encourages the feature extractor to learn representations that hold across all four participating centres. Since the training data is not evenly distributed across sources, we apply a logit-adjusted cross-entropy loss [1] to the source classification head to prevent underrepresented centres from being overlooked. Our pre-processing follows the SSFL framework with KDS [2], selecting eight representative slices per scan. Our method achieves an F1 score of 0.9098 and an AUC-ROC of 0.9647 on a validation set of 308 scans. The code is publicly available at https://github.com/Purdue-M2/-multisource-covid-ct.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a multi-task learning method for COVID-19 detection in multi-center chest CT scans. An EfficientNet-B7 backbone is shared between a COVID-19 classification head and a data-source (center) prediction head; logit-adjusted cross-entropy is used on the source head to handle class imbalance. Pre-processing selects eight representative slices per scan following the SSFL+KDS framework. The approach is claimed to produce source-robust features and is evaluated on a 308-scan validation set, reporting F1 = 0.9098 and AUC-ROC = 0.9647. Code is released publicly.

Significance. If the robustness claim holds, the work addresses an important practical problem of domain shift across scanners and populations in medical imaging. The empirical numbers on a held-out multi-center validation set are concrete, and public code is a positive contribution. However, the significance is limited by the absence of baseline comparisons, ablations, and a mechanistic justification for invariance.

major comments (3)
  1. [Abstract / Method] Abstract and method description: the central claim that joint training on source prediction 'encourages the feature extractor to learn representations that hold across all four participating centres' is mechanistically inverted. Gradients from the source-classification head (even with logit-adjusted CE) will drive the shared features to preserve center-specific cues that distinguish the four sources, rather than discard them. No gradient reversal, adversarial loss, or mutual-information penalty is described to enforce invariance. Consequently the reported F1/AUC cannot be attributed to source-robust features.
  2. [Experiments] Experiments section: no single-task baseline (e.g., EfficientNet-B7 trained only on COVID-19 labels) or competing multi-source methods are reported on the same 308-scan validation set. Without these comparisons it is impossible to quantify the contribution of the multi-task component or to substantiate the robustness improvement.
  3. [Experiments / Validation] Validation protocol: the manuscript gives no details on how the 308-scan validation set was constructed (random split, center-stratified, temporal, etc.), whether it is balanced across the four centers, or any statistical significance tests / confidence intervals for the F1 and AUC figures. These omissions make the numerical claims difficult to interpret.
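The invariance mechanisms the referee names in major comment 1 can be illustrated with a gradient reversal layer (the DANN-style component absent from the paper); this is an editorial sketch, not part of the authors' method:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients backward.

    Inserted between a shared backbone and a source-prediction head, this
    turns source classification into an adversarial signal that pushes the
    features to *discard* center-specific cues, instead of preserving them
    as plain joint training does.
    """
    @staticmethod
    def forward(ctx, x, lam: float):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Gradient w.r.t. x is flipped and scaled; lam receives no gradient.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)
```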
minor comments (2)
  1. [Method] The references to prior work [1] (logit-adjusted CE) and [2] (SSFL+KDS) are cited but the exact equations or implementation details used from those papers are not restated, which could aid reproducibility.
  2. [Figures / Tables] Figure captions and table headers should explicitly state the number of scans per center in both training and validation splits to clarify the imbalance that the logit adjustment is intended to mitigate.
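The logit-adjusted cross-entropy of [1] that minor comment 1 asks to be restated is compact; a NumPy sketch, assuming the standard formulation in which each logit is shifted by τ times the log class prior:

```python
import numpy as np

def logit_adjusted_ce(logits, labels, class_priors, tau=1.0):
    """Logit-adjusted cross-entropy (Menon et al., ICLR 2021).

    Adds tau * log(prior) to each class logit before the softmax, so that
    frequent (head) classes must win by a larger margin and underrepresented
    (tail) sources are not overlooked.
    """
    adjusted = logits + tau * np.log(class_priors)
    z = adjusted - adjusted.max(axis=1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With uniform priors the adjustment is a constant per-row shift, so the loss reduces to ordinary cross-entropy.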

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments on our manuscript. We have addressed each major point below and revised the manuscript where appropriate to improve clarity and completeness.

Point-by-point responses
  1. Referee: [Abstract / Method] the central claim that joint training on source prediction 'encourages the feature extractor to learn representations that hold across all four participating centres' is mechanistically inverted. Gradients from the source-classification head will drive the shared features to preserve center-specific cues. No gradient reversal, adversarial loss, or mutual-information penalty is described to enforce invariance.

    Authors: We agree that the original wording overstated the mechanism. Joint training with a source-prediction head and logit-adjusted loss primarily mitigates bias from source imbalance but does not enforce feature invariance without explicit adversarial or reversal components. We have revised the abstract, introduction, and method sections to remove the claim of learning 'representations that hold across all four participating centres' and instead describe the approach as jointly optimizing COVID-19 classification while accounting for source distribution. A limitations paragraph has been added noting the absence of invariance mechanisms and suggesting adversarial training for future work. revision: yes

  2. Referee: [Experiments] no single-task baseline (e.g., EfficientNet-B7 trained only on COVID-19 labels) or competing multi-source methods are reported on the same 308-scan validation set.

    Authors: We acknowledge this omission. In the revised manuscript we have added a single-task EfficientNet-B7 baseline trained only on COVID-19 labels, which achieves F1 0.8721 and AUC 0.9314 on the same validation set. We also include comparisons to two adapted multi-source methods (DANN and CORAL) under the same protocol. A new table reports these results, confirming the multi-task component contributes measurable gains. revision: yes

  3. Referee: [Experiments / Validation] the manuscript gives no details on how the 308-scan validation set was constructed (random split, center-stratified, temporal, etc.), whether it is balanced across the four centers, or any statistical significance tests / confidence intervals for the F1 and AUC figures.

    Authors: We have added the missing details to the Experiments section: the 308-scan validation set was obtained via a center-stratified random split ensuring proportional representation from all four centers (approximately 25% per center). We now report 95% bootstrap confidence intervals (F1: 0.9098 [0.8821, 0.9345]; AUC: 0.9647 [0.9512, 0.9763]) and include paired t-test p-values against the single-task baseline (p < 0.01). revision: yes
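The percentile-bootstrap intervals the simulated authors describe are computed by resampling validation scans with replacement; a sketch, with function names and parameters chosen for illustration:

```python
import numpy as np

def f1_score(y_true, y_pred):
    """Binary F1 from hard predictions."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample scan indices with replacement,
    recompute the metric each time, and take the alpha/2 and 1-alpha/2
    quantiles of the resampled statistics."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = [metric(y_true[idx], y_pred[idx])
             for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```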

Circularity Check

0 steps flagged

No circularity: empirical multi-task method with held-out validation

full rationale

The paper describes a straightforward empirical pipeline: an EfficientNet-B7 backbone is trained jointly on COVID-19 classification and source-centre prediction using logit-adjusted cross-entropy (cited from external reference [1]) and SSFL+KDS pre-processing (cited from [2]). Performance is reported on a separate 308-scan validation set (F1 0.9098, AUC 0.9647). No equations, fitted parameters, or self-citations are presented as a derivation that reduces the claimed robustness to the training inputs by construction. The shared-backbone assumption is a testable hypothesis, not a definitional identity or renamed fit. The result is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Relies on standard assumptions in multi-task deep learning and prior methods for pre-processing and loss adjustment. No new entities invented. Free parameters are typical hyperparameters such as task loss weights.

free parameters (1)
  • task loss weighting
    Balance between COVID diagnosis loss and source classification loss is a hyperparameter that must be chosen or tuned.
axioms (2)
  • domain assumption: Sharing the EfficientNet-B7 backbone between tasks leads to source-invariant features
    Core assumption of the multi-task approach to encourage cross-center generalization.
  • domain assumption: Logit-adjusted loss prevents bias from imbalanced source distributions
    Based on the cited prior work [1].

pith-pipeline@v0.9.0 · 5551 in / 1204 out tokens · 51418 ms · 2026-05-13T21:53:17.101517+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 1 internal anchor

  1. [1] A. K. Menon, S. Jayasumana, A. S. Rawat, H. Jain, A. Veit, and S. Kumar, "Long-tail learning via logit adjustment," in International Conference on Learning Representations (ICLR), 2021.
  2. [2] C.-C. Hsu, C.-M. Lee, Y. F. Chiang, Y.-S. Chou, C.-Y. Jiang, S.-C. Tai, and C.-H. Tsai, "A closer look at spatial-slice features learning for COVID-19 detection," in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024, pp. 4924–4934.
  3. [3] D. Kollias, A. Arsenos, L. Soukissian, and S. Kollias, "MIA-COV19D: COVID-19 detection through 3-D chest CT image analysis," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 537–544.
  4. [4] D. Kollias, A. Arsenos, and S. Kollias, "AI-MIA: COVID-19 detection and severity analysis through medical imaging," in European Conference on Computer Vision (ECCV) Workshops. Springer, 2022, pp. 677–690.
  5. [5] A. Arsenos, D. Kollias, and S. Kollias, "A large imaging database and novel deep neural architecture for COVID-19 diagnosis," in 2022 IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP). IEEE, 2022, pp. 1–5.
  6. [6] J. Hou, J. Xu, R. Feng, and Y. Zhang, "CMC-COV19D: Contrastive mixup classification for COVID-19 diagnosis," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2021, pp. 454–461.
  7. [7] Y. Li, J. Xu, N. Zhang, Y. Zhang, X. Zhang, R. Feng, and H. Chen, "Advancing COVID-19 detection in 3D CT scans," arXiv preprint arXiv:2403.11953, 2024.
  8. [8] S. Bansal, M. Wu, X. Wang, and S. Hu, "Robust fairness vision-language learning for medical image analysis," MIPR, 2025.
  9. [9] P. R. Liu, S. Bansal, J. Dinh, A. Pawar, R. Satishkumar, S. Desai, N. Gupta, X. Wang, and S. Hu, "MedChat: A multi-agent framework for multimodal diagnosis with large language models," MIPR, 2025.
  10. [10] J. Hu, K. Yu, H. Xian, S. Hu, and X. Wang, "Improving generalization of medical image registration foundation model," IJCNN, 2025.
  11. [11] H. Yang, H. Chen, H. Guo, Y. Chen, C.-S. Lin, S. Hu, J. Hu, X. Wu, and X. Wang, "LLM-MedQA: Enhancing medical question answering through case studies in large language models," IJCNN, 2025.
  12. [12] P. Huang, S. Hu, B. Peng, J. Zhang, H. Zhu, X. Wu, and X. Wang, "Diffusion-empowered autoprompt MedSAM," arXiv preprint arXiv:2502.06817, 2025.
  13. [13] P. Huang, S. Hu, B. Peng, J. Zhang, X. Wu, and X. Wang, "Robustly optimized deep feature decoupling network for fatty liver diseases detection," in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2024, pp. 68–78.
  14. [14] Y. Zheng, H. Xian, Z. Shuai, J. Hu, X. Wang, and S. Hu, "Contextual reinforcement learning for unsupervised deformable multimodal medical images registration," in 2024 IEEE International Joint Conference on Biometrics (IJCB). IEEE, 2024, pp. 1–9.
  15. [15] T. Y. Tsai, L. Lin, S. Hu, C. W. Tsao, X. Li, M.-C. Chang, H. Zhu, and X. Wang, "UU-Mamba: Uncertainty-aware U-Mamba for cardiovascular segmentation," arXiv preprint arXiv:2409.14305, 2024.
  16. [16] J. Yang, P. Huang, J. Hu, S. Hu, S. Lyu, X. Wang, J. Guo, and X. Wu, "An explainable non-local network for COVID-19 diagnosis," arXiv preprint arXiv:2408.04300, 2024.
  17. [17] T. Y. Tsai, S. Hu et al., "UU-Mamba: Uncertainty-aware U-Mamba for cardiac image segmentation," in 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2024, pp. 267–273.
  18. [18] L. Lin, Y. S. Krubha, Z. Yang, C. Ren, T. D. Le, I. Amerini, X. Wang, and S. Hu, "Robust COVID-19 detection in CT images with CLIP," in 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR). IEEE, 2024, pp. 586–592.
  19. [19] X. Zhu, T. Liu, Z. Liu, O. Shaobo, X. Wang, S. Hu, and F. Ding, "CGD-Net: A hybrid end-to-end network with gating decoding for liver tumor segmentation from CT images," in 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). IEEE, 2024, pp. 1–7.
  20. [20] X. Wang, X. Liu, P. Huang, P. Huang, S. Hu, and H. Zhu, "Challenge summary U-MedSAM: Uncertainty-aware MedSAM for medical image segmentation," arXiv preprint arXiv:2408.08881, 2024.
  21. [21] J. Hu, Q. Fan, S. Hu, S. Lyu, X. Wu, and X. Wang, "UMedNeRF: Uncertainty-aware single view volumetric rendering for medical neural radiance fields," in 2024 IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, 2024, pp. 1–4.
  22. [22] X. Wang, Y. Chen, S. Hu, H. Fan, H. Zhu, and X. Li, "Neural radiance fields in medical imaging: A survey," arXiv preprint arXiv:2402.17797, 2024.
  23. [23] J. Hu, Z. Shuai, X. Wang, S. Hu, S. Sun, S. Lyu, and X. Wu, "Attention guided policy optimization for 3D medical image registration," IEEE Access, vol. 11, pp. 65546–65558, 2023.
  24. [24] D. Kollias, A. Arsenos, and S. Kollias, "AI-enabled analysis of 3-D CT scans for diagnosis of COVID-19 and its severity," in 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2023, pp. 1–5.
  25. [25] D. Kollias, A. Arsenos, and S. Kollias, "Domain adaptation, explainability and fairness in AI for medical image analysis: Diagnosis of COVID-19 based on 3-D chest CT-scans," in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024.
  26. [26] D. Kollias, A. Arsenos, and S. Kollias, "A deep neural architecture for harmonizing 3-D input data analysis and decision making in medical imaging," Neurocomputing, vol. 542, p. 126244, 2023.
  27. [27] D. Kollias, A. Arsenos, and S. Kollias, "Domain adaptation explainability & fairness in AI for medical image analysis: Diagnosis of COVID-19 based on 3-D chest CT-scans," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 4907–4914.
  28. [28] M. Tan and Q. V. Le, "EfficientNet: Rethinking model scaling for convolutional neural networks," in International Conference on Machine Learning (ICML), 2019, pp. 6105–6114.
  29. [29] D. Kollias, A. Arsenos, L. Soukissian, and S. Kollias, "MIA-COV19D: COVID-19 detection through 3-D chest CT image analysis," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 537–544.
  30. [30] J. Hou, J. Xu, R. Feng, Y. Zhang, F. Shan, and W. Shi, "CMC-COV19D: Contrastive mixup classification for COVID-19 diagnosis," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 454–461.
  31. [31] J. Hou, J. Xu, L. Jiang, S. Du, R. Feng, Y. Zhang, F. Shan, and X. Xue, "Periphery-aware COVID-19 diagnosis with contrastive representation enhancement," Pattern Recognition, vol. 118, p. 108005, 2021.
  32. [32] C.-C. Hsu, C.-M. Lee, Y. F. Chiang, Y.-S. Chou, C.-Y. Jiang, S.-C. Tai, and C.-H. Tsai, "A closer look at spatial-slice features learning for COVID-19 detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024, pp. 4924–4934.
  33. [33] Q. Li, R. Yuan, J. Hou, J. Xu, Y. Zhang, R. Feng, and H. Chen, "Advancing lung disease diagnosis in 3D CT scans," arXiv preprint arXiv:2507.00993, 2025.
  34. [34] H. Zhang, C. Wu, Z. Zhang, Y. Zhu, H. Lin, Z. Zhang, Y. Sun, T. He, J. Mueller, R. Manmatha et al., "ResNeSt: Split-attention networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 2736–2746.
  35. [35] D. Kollias, A. Arsenos, and S. Kollias, "A deep neural architecture for harmonizing 3-D input data analysis and decision making in medical imaging," Neurocomputing, vol. 542, p. 126244, 2023.
  36. [36] D. Kollias, A. Arsenos, and S. Kollias, "AI-enabled analysis of 3-D CT scans for diagnosis of COVID-19 and its severity," in 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW). IEEE, 2023, pp. 1–5.
  37. [37] R. Yuan, Q. Li, J. Hou, J. Xu, Y. Zhang, R. Feng, and H. Chen, "Domain adaptation using pseudo labels for COVID-19 detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 5141–5148.
  38. [38] J. Hu, S. Wang, and X. Li, "Improving the generalization ability of deepfake detection via disentangled representation learning," in 2021 IEEE International Conference on Image Processing (ICIP). IEEE, 2021, pp. 3577–3581.
  39. [39] Z. Obermeyer, B. Powers, C. Vogeli, and S. Mullainathan, "Dissecting racial bias in an algorithm used to manage the health of populations," Science, vol. 366, no. 6464, pp. 447–453, 2019.
  40. [40] L. Seyyed-Kalantari, H. Zhang, M. B. A. McDermott, I. Y. Chen, and M. Ghassemi, "Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations," Nature Medicine, vol. 27, no. 12, pp. 2176–2182, 2021.
  41. [41] D. Kollias, A. Arsenos, and S. Kollias, "Domain adaptation explainability and fairness in AI for medical image analysis: Diagnosis of COVID-19 based on 3-D chest CT-scans," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 4907–4914.
  42. [42] Y. Ju, S. Hu et al., "Improving fairness in deepfake detection," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 4655–4665.
  43. [43] S. Hu and G. H. Chen, "Distributionally robust survival analysis: A novel fairness loss without demographics," in Machine Learning for Health (ML4H). PMLR, 2022, pp. 62–
  44. [44] L. Lin, X. He, Y. Ju, X. Wang, F. Ding, and S. Hu, "Preserving fairness generalization in deepfake detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
  45. [45] F. Ding, W. Yi, Y. Zhou, X. He, H. Rao, and S. Hu, "Fairness-aware deepfake detection: Leveraging dual-mechanism optimization," arXiv preprint arXiv:2511.10150, 2025.
  46. [46] A. Hou, L. Lin, J. Li, and S. Hu, "Rethinking individual fairness in deepfake detection," in Proceedings of the 33rd ACM International Conference on Multimedia, 2025, pp. 11424–11433.
  47. [47] Y. S. Krubha, L. Lin, S. Hu et al., "Robust AI-generated face detection with imbalanced data," MIPR, 2025.
  48. [48] M. Wu, L. Lin, W. Zhang, X. Wang, Z. Yang, and S. Hu, "Preserving AUC fairness in learning with noisy protected groups," in The 42nd International Conference on Machine Learning (ICML), 2025.
  49. [49] L. Lin, S. Santosh, M. Wu, X. Wang, and S. Hu, "AI-Face: A million-scale demographically annotated AI-generated face dataset and fairness benchmark," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 3503–3515.
  50. [50] L. Lin, S. Hu et al., "Robust CLIP-based detector for exposing diffusion model-generated images," MIPR, 2024.
  51. [51] L. Lin, X. He, Y. Ju, X. Wang, F. Ding, and S. Hu, "Preserving fairness generalization in deepfake detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 16815–16825.
  52. [52] S. Hu and G. H. Chen, "Fairness in survival analysis with distributionally robust optimization," Journal of Machine Learning Research, vol. 25, no. 246, pp. 1–85, 2024.
  53. [53] S. Hu, X. Wang et al., "Rank-based decomposable losses in machine learning: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
  54. [54] W. Pu, J. Hu, X. Wang, Y. Li, S. Hu, B. Zhu, R. Song, Q. Song, X. Wu, and S. Lyu, "Learning a deep dual-level network for robust deepfake detection," Pattern Recognition, vol. 130, p. 108832, 2022.
  55. [55] H. Guo, S. Hu, X. Wang, M.-C. Chang, and S. Lyu, "Robust attentive deep neural network for detecting GAN-generated faces," IEEE Access, vol. 10, pp. 32574–32583, 2022.
  56. [56] S. Hu, Y. Ying, X. Wang, and S. Lyu, "Sum of ranked range loss for supervised learning," Journal of Machine Learning Research, vol. 23, no. 112, pp. 1–44, 2022.
  57. [57] S. Hu, Y. Ying, X. Wang, and S. Lyu, "Learning by minimizing the sum of ranked range," Advances in Neural Information Processing Systems, vol. 33, pp. 21013–21023, 2020.
  58. [58] K. Cao, C. Wei, A. Gaidon, N. Arechiga, and T. Ma, "Learning imbalanced datasets with label-distribution-aware margin loss," in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, 2019.
  59. [59] S. Hu, X. Wang, and S. Lyu, "Rank-based decomposable losses in machine learning: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
  60. [60] J. Tan, C. Wang, B. Li, Q. Li, W. Ouyang, C. Yin, and J. Yan, "Equalization loss for long-tailed object recognition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
  61. [61] D. Kollias, A. Arsenos, and S. Kollias, "Pharos-AFE-AIMI: Multi-source & fair disease diagnosis," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 7265–7273.
  62. [62] A. Buslaev, V. I. Iglovikov, E. Khvedchenya, A. Parinov, M. Druzhinin, and A. A. Kalinin, "Albumentations: Fast and flexible image augmentations," Information, vol. 11, no. 2, 2020, p. 125.
  63. [63] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2015.