arxiv: 2605.17456 · v1 · pith:X33C4SLMnew · submitted 2026-05-17 · 💻 cs.CV · cs.AI

GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging

Pith reviewed 2026-05-20 14:30 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords multiple instance learning · whole-slide imaging · evidence quality · attention mechanisms · sufficiency necessity recoverability · digital pathology · faithful explanations

The pith

GCE-MIL directly optimizes evidence for sufficiency, necessity and recoverability in whole-slide MIL instead of treating attention weights as a byproduct of classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard attention models in multiple instance learning for whole-slide images optimize only for slide-level accuracy, which produces patches that fail to support the diagnosis when kept alone, fail to change the prediction when removed, and fail to match the discrete selections used at inference. The paper claims that evidence quality improves when these three properties are optimized explicitly through dedicated components rather than inherited indirectly. GCE-MIL implements this as a wrapper around existing backbones using a grounding step tied to domain concepts, a noisy-OR term as a differentiable stand-in for coverage search, and a marginal-guided repair step that turns continuous scores into discrete patch sets. Across nine backbones and nine datasets the approach raises average Macro-F1 by 0.024 and C-index by 0.014, shrinks the continuous-discrete mismatch by four to seven points, and allows optional prefiltering that speeds inference up to five times while preserving nearly all utility.
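The noisy-OR term mentioned above can be illustrated with a minimal sketch. It assumes each patch i has a continuous gate g_i in [0, 1] and a response r_ik in [0, 1] to each concept anchor k, and computes per-anchor coverage as 1 - prod_i (1 - g_i * r_ik). The names `noisy_or_coverage`, `gates`, and `responses` are hypothetical, and the paper's exact formulation may differ.

```python
from typing import List


def noisy_or_coverage(gates: List[float], responses: List[List[float]]) -> List[float]:
    """Per-anchor noisy-OR coverage: 1 - prod_i (1 - g_i * r_ik).

    gates[i]        -- assumed continuous selection score of patch i, in [0, 1]
    responses[i][k] -- assumed response of patch i to anchor k, in [0, 1]
    """
    if not responses:
        return []
    num_anchors = len(responses[0])
    coverage = []
    for k in range(num_anchors):
        miss = 1.0  # probability that no selected patch covers anchor k
        for g, row in zip(gates, responses):
            miss *= 1.0 - g * row[k]
        coverage.append(1.0 - miss)
    return coverage


# Two selected patches both respond to anchor 0; the third patch is gated off.
gates = [1.0, 1.0, 0.0]
responses = [[0.5, 0.0], [0.5, 0.0], [0.9, 0.8]]
cov = noisy_or_coverage(gates, responses)
print(cov)  # [0.75, 0.0]
```

In this toy example the first patch alone lifts anchor 0 to 0.5, and the second identical patch adds only 0.25 more. That is the diminishing-returns behaviour that would favour complementary patches over redundant ones.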

Core claim

Evidence quality in MIL for whole-slide imaging is improved by optimizing directly for Sufficiency, Necessity, and Recoverability through three injection modes and three evidence components (grounding, noisy-OR coverage, and threshold-plus-repair recovery) rather than relying on attention optimized solely for classification.

What carries the argument

GCE-MIL wrapper consisting of a grounding mechanism that aligns selection with domain concepts, noisy-OR coverage as a differentiable proxy for interventional evidence, and threshold-plus-repair recovery that converts continuous attention into discrete recoverable subsets.
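The threshold-plus-repair idea can be sketched under stated assumptions: threshold the continuous gates, fall back to the top patch if nothing passes, then greedily add the patch with the largest marginal gain in mean noisy-OR coverage until a target is met. The threshold `tau`, the coverage `target`, the stopping rule, and all function names are hypothetical illustrations, not the paper's settings.

```python
from typing import List, Set


def mean_coverage(selected: Set[int], responses: List[List[float]]) -> float:
    """Mean noisy-OR coverage over anchors for a discrete patch subset."""
    num_anchors = len(responses[0])
    total = 0.0
    for k in range(num_anchors):
        miss = 1.0
        for i in selected:
            miss *= 1.0 - responses[i][k]
        total += 1.0 - miss
    return total / num_anchors


def threshold_plus_repair(
    gates: List[float],
    responses: List[List[float]],
    tau: float = 0.5,     # hypothetical gate threshold
    target: float = 0.6,  # hypothetical coverage target
) -> List[int]:
    # Step 1: threshold the continuous gates.
    selected = {i for i, g in enumerate(gates) if g >= tau}
    # Fall back to the single top-scoring patch if nothing passes.
    if not selected:
        selected = {max(range(len(gates)), key=lambda i: gates[i])}

    # Step 2: greedy repair driven by exact marginal coverage gain.
    remaining = set(range(len(gates))) - selected
    current = mean_coverage(selected, responses)
    while current < target and remaining:
        best = max(sorted(remaining),
                   key=lambda i: mean_coverage(selected | {i}, responses))
        best_cov = mean_coverage(selected | {best}, responses)
        if best_cov <= current:
            break  # no remaining patch improves coverage
        selected.add(best)
        remaining.remove(best)
        current = best_cov
    return sorted(selected)


gates = [0.9, 0.2, 0.4, 0.1]
responses = [[0.8, 0.0], [0.7, 0.1], [0.0, 0.9], [0.1, 0.1]]
recovered = threshold_plus_repair(gates, responses)
print(recovered)  # [0, 2]
```

Here only patch 0 passes the threshold, which leaves the second anchor uncovered. Repair adds patch 2 because it contributes the largest marginal coverage, even though patch 1 has a comparable response to the already-covered first anchor.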

If this is right

  • Keeping only the selected patches maintains nearly the same Macro-F1 as the full slide.
  • Removing the selected patches produces a substantially larger change in the slide-level prediction.
  • Continuous attention scores become consistent with the discrete patch subsets actually used at inference.
  • Tile prefiltering after discrete recovery yields up to 5x faster inference while retaining 0.989 of full-bag utility.
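The keep-only and remove checks in the first two bullets can be sketched for a single slide as follows. This is a toy illustration, not the paper's evaluation pipeline: `mean_pool_score` is a hypothetical stand-in for a slide-level model, and the paper reports Macro-F1 and C-index aggregated over datasets rather than per-slide scores.

```python
from typing import Callable, Dict, List, Sequence

Bag = List[List[float]]  # one feature vector per patch


def mean_pool_score(bag: Bag) -> float:
    """Toy slide-level model: mean of the first feature over all patches."""
    if not bag:
        return 0.0
    return sum(patch[0] for patch in bag) / len(bag)


def keep_only(bag: Bag, selected: Sequence[int]) -> Bag:
    """Restrict the bag to the selected patches (sufficiency check)."""
    chosen = set(selected)
    return [p for i, p in enumerate(bag) if i in chosen]


def remove_selected(bag: Bag, selected: Sequence[int]) -> Bag:
    """Drop the selected patches from the bag (necessity check)."""
    chosen = set(selected)
    return [p for i, p in enumerate(bag) if i not in chosen]


def evidence_scores(model: Callable[[Bag], float], bag: Bag,
                    selected: Sequence[int]) -> Dict[str, float]:
    """Score the full bag, the kept subset, and the complement."""
    return {
        "full": model(bag),
        "keep_only": model(keep_only(bag, selected)),
        "removed": model(remove_selected(bag, selected)),
    }


bag = [[1.0], [1.0], [0.0], [0.0]]
result = evidence_scores(mean_pool_score, bag, selected=[0, 1])
print(result)  # {'full': 0.5, 'keep_only': 1.0, 'removed': 0.0}
```

In this example the selected patches alone preserve the positive signal and removing them erases it, which is the pattern the bullets describe for sufficient and necessary evidence.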

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same explicit S/N/R criteria could be ported to attention models outside digital pathology where explanations must remain faithful to inference behavior.
  • Recoverability may matter most in settings that require auditability or regulatory review of which exact patches drove a decision.
  • The reported gains rest on 81 configurations; larger-scale tests on unseen tissue types would clarify whether the wrapper generalizes without retuning.

Load-bearing premise

The three injection modes and evidence components can be added to arbitrary backbones without introducing new selection biases or requiring dataset-specific tuning.

What would settle it

Applying GCE-MIL to a new backbone or dataset outside the tested set and observing that Macro-F1 drops, the continuous-discrete gap widens, or inference utility falls below 0.989 would falsify the claim that the method reliably produces faithful evidence.

Figures

Figures reproduced from arXiv: 2605.17456 by Ran Su, Xiangyu Li.

Figure 1. Three evidence failures in classification-optimized MIL. (Left) Attention top-k achieves only 0.640 keep-only Macro-F1 vs. GCE 0.722: attention is not sufficient evidence. (Middle) The continuous selector becomes bimodal during training, enabling discretization with C-D gap 0.004. (Right) Adding GCE preserves 0.99× full-bag performance across backbones. Attention-regularized variants such as ACMIL, AEM, an…
Figure 2. GCE-MIL architecture. The framework wraps existing MIL backbones with three components: (1) low-rank adapter and semantic bridge for anchor grounding (drives Necessity via concept coverage), (2) continuous selector with exact noisy-OR coverage (drives Sufficiency via multi-source evidence), (3) threshold-plus-repair discrete recovery (drives Recoverability via the same marginal coverage objective). The hos…
Figure 3. Main qualitative evidence example. The host attention map and the recovered GCE evidence subset are shown on the same slide. GCE selects a compact, recoverable evidence set rather than simply visualizing the original attention ranking.
Figure 4. T-SNE before evidence grounding. Slide representations are less separated before the GCE evidence objective is applied. Panel labels: Epochs 1, 3, 5, 7, 9, 11, 13, 15.
Figure 5. T-SNE after evidence grounding. GCE training produces more separated slide representations, consistent with the classification and evidence diagnostics.
Figure 6. Mechanism 1: feature adaptation. Low-rank residual adaptation keeps the pretrained feature space close to UNI while improving selector compatibility.
Figure 7. Mechanism 2: anchor response. The bridge maps patch features into TITAN anchor space and produces patch-anchor responses used by the coverage utility.
Figure 8. Mechanism 3: continuous gate. Temperature annealing sharpens the selector distribution so that continuous gates can be recovered as a discrete subset.
Figure 9. Mechanism 4: noisy-OR utility. Exact noisy-OR coverage gives diminishing returns once an anchor is already covered, encouraging complementary evidence.
Figure 10. Mechanism 5: threshold recovery. The initial discrete subset is obtained by thresholding the continuous gate and falling back to the top patch when needed.
Figure 11. Mechanism 6: greedy repair. Repair adds patches according to exact marginal utility until coverage and sufficiency criteria are restored.
The following qualitative overlays show representative evidence maps exported from the evaluation pipeline. They are not used to claim pixel-level correctness; the quantitative support for evidence quality comes from keep-only, remove, C-D gap, and CAMELYON-16 localizat…
Figure 12. Qualitative evidence overlay 1. Representative slide-level attention and recovered GCE evidence.
Figure 13. Qualitative evidence overlay 2. Representative slide-level attention and recovered GCE evidence.
Figure 14. Qualitative evidence overlay 3. Representative slide-level attention and recovered GCE evidence.
Figure 15. Qualitative evidence overlay 4. Representative slide-level attention and recovered GCE evidence.
Figure 16. Qualitative evidence overlay 5. Representative slide-level attention and recovered GCE evidence.
Figure 17. Qualitative evidence overlay 6. Representative slide-level attention and recovered GCE evidence.
M Limitations and Discussion
The S/N/R formalization operates at patch level; extending it to pixel-level or region-level evidence remains open. The anchor bank uses fixed text prompts that may not transfer to rare or previously unseen tissue types without prompt adaptation. Finally, S/N/R measures model-relat…
Original abstract

Multiple instance learning (MIL) is the standard approach for whole-slide image (WSI) classification and survival prediction, where attention-based models aggregate patch features into slide-level predictions. These models treat attention weights as evidence for their predictions, but attention is optimized for classification, not for identifying which patches actually support the diagnosis. This conflation leads to three failures: selected patches are insufficient (keeping them alone drops Macro-F1 by 0.078), unnecessary (removing them barely changes the prediction), and unrecoverable (continuous attention scores disagree with discrete patch subsets used at inference). The central premise is that evidence quality should be optimized directly through explicit criteria, namely Sufficiency, Necessity, and Recoverability (S/N/R), rather than inherited as a byproduct of classification. GCE-MIL is a backbone-agnostic wrapper implemented through three injection modes and three evidence components: a grounding mechanism that aligns selection with domain-specific concepts, noisy-OR coverage that acts as a differentiable proxy for interventional evidence search, and threshold-plus-repair recovery that converts continuous selectors into discrete subsets through marginal-guided repair. Across 9 backbones and 9 datasets (81 configurations), GCE-MIL improves average Macro-F1 by 0.024 and C-index by 0.014, reduces the continuous-discrete gap by 4-7, and increases complement degradation by 2-4. With optional tile prefiltering after discrete recovery, inference runs up to 5x faster while retaining 0.989 full-bag utility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces GCE-MIL, a backbone-agnostic wrapper for multiple instance learning (MIL) models used in whole-slide image (WSI) classification and survival prediction. It identifies three failures of standard attention-based evidence (insufficient, unnecessary, unrecoverable) and proposes to optimize directly for Sufficiency, Necessity, and Recoverability (S/N/R) via three components: grounding to domain concepts, noisy-OR coverage as a differentiable proxy, and threshold-plus-repair for converting continuous selectors into discrete subsets. Across 9 backbones and 9 datasets (81 configurations), it reports average improvements of 0.024 in Macro-F1 and 0.014 in C-index, a continuous-discrete gap reduction of 4-7, and a complement degradation increase of 2-4, with optional prefiltering giving up to 5x faster inference while retaining 0.989 of full-bag utility.

Significance. If the results hold, this could be a significant contribution to computational pathology by providing a way to generate more faithful evidence in MIL without sacrificing classification performance. The explicit S/N/R criteria and the large-scale evaluation across many configurations are positive aspects. The method's potential to improve both accuracy and interpretability makes it relevant for clinical applications where evidence quality matters.

major comments (3)
  1. Abstract: The abstract states that selected patches are insufficient as keeping them alone drops Macro-F1 by 0.078, but does not specify the exact procedure for measuring sufficiency or necessity (e.g., how the subset is chosen, what threshold is used), which is load-bearing for the central claim that S/N/R optimization improves evidence quality.
  2. Abstract: The backbone-agnostic claim and consistent gains across 81 configurations are central, yet the grounding mechanism's reliance on domain-specific concepts is not shown to be free of dataset-specific tuning or selection biases, as noted in the potential for implicit calibration in the repair step.
  3. Abstract: The reduction in continuous-discrete gap by 4-7 and the increase in complement degradation by 2-4 are reported without error bars, confidence intervals, or details on the exact metrics used for the gap and degradation, making it hard to evaluate the statistical robustness of these improvements.
minor comments (2)
  1. Abstract: Typographical errors: 'ag gregate' should be 'aggregate' and 'Recov erability' should be 'Recoverability'.
  2. Abstract: The three injection modes are mentioned but not briefly described, which would help readers understand how they implement the evidence components.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and the positive assessment of GCE-MIL's potential significance. We address each major comment point by point below with clarifications and proposed revisions.

Point-by-point responses
  1. Referee: Abstract: The abstract states that selected patches are insufficient as keeping them alone drops Macro-F1 by 0.078, but does not specify the exact procedure for measuring sufficiency or necessity (e.g., how the subset is chosen, what threshold is used), which is load-bearing for the central claim that S/N/R optimization improves evidence quality.

    Authors: We agree that the abstract would benefit from greater precision on the measurement procedures. Sufficiency is measured by evaluating model performance (Macro-F1 or C-index) when the input bag is restricted to only the recovered discrete patch subset; necessity is measured by the performance change when the selected subset is removed from the full bag. The subset is obtained via the threshold-plus-repair procedure described in Section 3.3. We will revise the abstract to include a concise description of these procedures while referring readers to the Methods for full implementation details. revision: yes

  2. Referee: Abstract: The backbone-agnostic claim and consistent gains across 81 configurations are central, yet the grounding mechanism's reliance on domain-specific concepts is not shown to be free of dataset-specific tuning or selection biases, as noted in the potential for implicit calibration in the repair step.

    Authors: The grounding component maps patches to a fixed vocabulary of general pathology concepts (e.g., tumor epithelium, stroma, necrosis) that are applied uniformly across all nine datasets without per-dataset selection or hyperparameter tuning. The repair step is a deterministic, marginal-guided procedure that operates on the continuous scores and does not perform dataset-specific calibration. We will add a clarifying paragraph and supporting ablation in the revision demonstrating that the reported gains persist when grounding concepts are held constant, thereby reinforcing the backbone- and dataset-agnostic character of the wrapper. revision: yes

  3. Referee: Abstract: The reduction in continuous-discrete gap by 4-7 and the increase in complement degradation by 2-4 are reported without error bars, confidence intervals, or details on the exact metrics used for the gap and degradation, making it hard to evaluate the statistical robustness of these improvements.

    Authors: We will include standard deviations across the 81 configurations and 95% confidence intervals for these aggregate improvements in the revised abstract and results tables. The continuous-discrete gap is the absolute difference in downstream performance between continuous attention aggregation and the discrete recovered subset; complement degradation is the performance drop observed when the model is evaluated on the complement (non-selected) patches. Precise definitions and the requested statistical details will be added. revision: yes
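The verbal definitions given in the responses above can be written compactly. The notation below is an assumption introduced for illustration only: M is the evaluation metric (Macro-F1 or C-index), f the model applied to a bag, X the full bag, S the recovered discrete subset, and f_cont the model using continuous attention scores a. The paper's own symbols may differ.

```latex
\begin{align*}
\text{Sufficiency (keep-only):}\quad
  & M\big(f(X_S)\big) \\
\text{Necessity / complement degradation:}\quad
  & M\big(f(X)\big) - M\big(f(X \setminus S)\big) \\
\text{Continuous-discrete gap:}\quad
  & \big|\, M\big(f_{\text{cont}}(X; a)\big) - M\big(f(X_S)\big) \,\big|
\end{align*}
```

Read this way, good evidence keeps the first quantity close to M(f(X)), makes the second large, and drives the third toward zero.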

Circularity Check

0 steps flagged

No significant circularity: S/N/R criteria and GCE-MIL components are defined externally to the reported empirical gains.

full rationale

The paper defines Sufficiency, Necessity, and Recoverability as explicit optimization targets separate from the classification loss, then implements them as a backbone-agnostic wrapper with three injection modes and three evidence components (grounding, noisy-OR coverage, threshold-plus-repair recovery). Reported gains (0.024 Macro-F1, 0.014 C-index, 4-7 gap reduction) are measured outcomes from 81 experimental configurations across 9 backbones and 9 datasets, not quantities forced by construction inside the method's own equations or self-citations. No load-bearing step reduces a claimed result to a fitted parameter or prior self-citation that itself assumes the target outcome; the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the untested premise that noisy-OR can serve as a differentiable proxy for interventional evidence search and that marginal-guided repair reliably converts continuous scores to discrete subsets without loss of utility.

free parameters (1)
  • threshold for discrete recovery
    Used to convert continuous attention into binary patch selection; value not stated in abstract.
axioms (1)
  • domain assumption: Attention weights can be meaningfully aligned with domain-specific concepts via the grounding mechanism
    Invoked when describing the grounding component.




Reference graph

Works this paper leans on

78 extracted references · 78 canonical work pages · 4 internal anchors

  1. [1]

    IEEE transactions on cybernetics , volume=

    Weakly supervised deep learning for whole slide lung cancer image analysis , author=. IEEE transactions on cybernetics , volume=. 2019 , publisher=

  2. [2]

    Nature medicine , volume=

    Clinical-grade computational pathology using weakly supervised deep learning on whole slide images , author=. Nature medicine , volume=. 2019 , publisher=

  3. [3]

    IEEE Transactions on Circuits and Systems for Video Technology , volume=

    Rethinking multiple instance learning for whole slide image classification: A good instance classifier is all you need , author=. IEEE Transactions on Circuits and Systems for Video Technology , volume=. 2024 , publisher=

  4. [4]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Visual language pretrained multiple instance zero-shot transfer for histopathology images , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  5. [5]

    Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

    Gecko: Gigapixel vision-concept contrastive pretraining in histopathology , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

  6. [6]

    An attention-based multi-resolution model for prostate whole slide imageclassification and localization

    An attention-based multi-resolution model for prostate whole slide imageclassification and localization , author=. arXiv preprint arXiv:1905.13208 , year=

  7. [7]

    Nature biomedical engineering , volume=

    Data-efficient and weakly supervised computational pathology on whole-slide images , author=. Nature biomedical engineering , volume=. 2021 , publisher=

  8. [8]

    Nature communications , volume=

    An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning , author=. Nature communications , volume=. 2021 , publisher=

  9. [9]

    Proceedings of Machine Learning Research , volume=

    Do Multiple Instance Learning Models Transfer? , author=. Proceedings of Machine Learning Research , volume=. 2025 , publisher=

  10. [10]

    European conference on computer vision , pages=

    Attention-challenging multiple instance learning for whole slide image classification , author=. European conference on computer vision , pages=. 2024 , organization=

  11. [11]

    International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

    Higt: Hierarchical interaction graph-transformer for whole slide image analysis , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2023 , organization=

  12. [12]

    arXiv preprint arXiv:2411.18225 , year=

    Paths: A hierarchical transformer for efficient whole slide image analysis , author=. arXiv preprint arXiv:2411.18225 , year=

  13. [13]

    The Twelfth International Conference on Learning Representations , year=

    Olga Fourkioti and Matt. The Twelfth International Conference on Learning Representations , year=

  14. [14]

    International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

    AEM: attention entropy maximization for multiple instance learning based whole slide image classification , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2025 , organization=

  15. [15]

    International conference on machine learning , pages=

    Attention-based deep multiple instance learning , author=. International conference on machine learning , pages=. 2018 , organization=

  16. [16]

    The journal of machine learning research , volume=

    Dropout: a simple way to prevent neural networks from overfitting , author=. The journal of machine learning research , volume=. 2014 , publisher=

  17. [17]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Momentum contrast for unsupervised visual representation learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  18. [18]

    Proceedings of the IEEE/CVF international conference on computer vision , pages=

    Emerging properties in self-supervised vision transformers , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=

  19. [19]

    The Bell system technical journal , volume=

    A mathematical theory of communication , author=. The Bell system technical journal , volume=. 1948 , publisher=

  20. [20]

    Database , volume=

    Bracs: A dataset for breast carcinoma subtyping in h&e histology images , author=. Database , volume=. 2022 , publisher=

  21. [21]

    IEEE transactions on medical imaging , volume=

    From detection of individual metastases to classification of lymph node status at the patient level: the camelyon17 challenge , author=. IEEE transactions on medical imaging , volume=. 2018 , publisher=

  22. [22]

    Jama , volume=

    Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer , author=. Jama , volume=

  23. [23]

    Nature Reviews Bioengineering , volume=

    Artificial intelligence for digital and computational pathology , author=. Nature Reviews Bioengineering , volume=. 2023 , publisher=

  24. [24]

    The Journal of pathology , volume=

    Computational pathology in cancer diagnosis, prognosis, and prediction--present day and prospects , author=. The Journal of pathology , volume=. 2023 , publisher=

  25. [25]

    Journal of pathology informatics , volume=

    Review of the current state of whole slide imaging in pathology , author=. Journal of pathology informatics , volume=. 2011 , publisher=

  26. [26]

    Artificial intelligence , volume=

    Solving the multiple instance problem with axis-parallel rectangles , author=. Artificial intelligence , volume=. 1997 , publisher=

  27. [27]

    Advances in neural information processing systems , volume=

    Transmil: Transformer based correlated multiple instance learning for whole slide image classification , author=. Advances in neural information processing systems , volume=

  28. [28]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Dtfd-mil: Double-tier feature distillation multiple instance learning for histopathology whole slide image classification , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  29. [29]

    Xiong, Yunyang and Zeng, Zhanpeng and Chakraborty, Rudrasis and Tan, Mingxing and Fung, Glenn and Li, Yin and Singh, Vikas , booktitle=. Nystr

  30. [30]

    Frontiers in Oncology , volume=

    Computational image analysis identifies histopathological image features associated with somatic mutations and patient survival in gastric adenocarcinoma , author=. Frontiers in Oncology , volume=. 2021 , publisher=

  31. [31]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

    Benchmarking self-supervised learning on diverse pathology datasets , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

  32. [32]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

  33. [33]

    Medical image analysis , volume=

    Weakly supervised histopathology cancer image segmentation and classification , author=. Medical image analysis , volume=. 2014 , publisher=

  34. [34]

    Bioinformatics , volume=

    Classifying and segmenting microscopy images with deep multiple instance learning , author=. Bioinformatics , volume=. 2016 , publisher=

  35. [35]

    Distilling the Knowledge in a Neural Network

    Distilling the knowledge in a neural network , author=. arXiv preprint arXiv:1503.02531 , year=

  36. [36]

    Medical Image Analysis , volume=

    Ms-clam: Mixed supervision for the classification and localization of tumors in whole slide images , author=. Medical Image Analysis , volume=. 2023 , publisher=

  37. [37]

    International conference on machine learning , pages=

    From softmax to sparsemax: A sparse model of attention and multi-label classification , author=. International conference on machine learning , pages=. 2016 , organization=

  38. [38]

    Journal of statistical physics , volume=

    Possible generalization of Boltzmann-Gibbs statistics , author=. Journal of statistical physics , volume=. 1988 , publisher=

  39. [39]

    Advances in neural information processing systems , volume=

    Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results , author=. Advances in neural information processing systems , volume=

  40. [40]

    Transactions on Machine Learning Research Journal , year=

    DINOv2: Learning Robust Visual Features without Supervision , author=. Transactions on Machine Learning Research Journal , year=

  41. [41]

    Nature medicine , volume=

    Towards a general-purpose foundation model for computational pathology , author=. Nature medicine , volume=. 2024 , publisher=

  42. [42]

    Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

    Deep residual learning for image recognition , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

  43. [43]

    International journal of computer vision , volume=

    Imagenet large scale visual recognition challenge , author=. International journal of computer vision , volume=. 2015 , publisher=

  44. [44]

    International conference on medical image computing and computer-assisted intervention , pages=

    Mambamil: Enhancing long sequence modeling with sequence reordering in computational pathology , author=. International conference on medical image computing and computer-assisted intervention , pages=. 2024 , organization=

  45. [45]

    , author=

    Visualizing data using t-SNE. , author=. Journal of machine learning research , volume=

  46. [46]

    International conference on machine learning , pages=

    A simple framework for contrastive learning of visual representations , author=. International conference on machine learning , pages=. 2020 , organization=

  47. [47]

    Advances in neural information processing systems , volume=

    A framework for multiple-instance learning , author=. Advances in neural information processing systems , volume=

  48. [48]

    The Journal of the Acoustical Society of America , volume=

    The FROC curve: A representation of the observer's performance for the method of free response , author=. The Journal of the Acoustical Society of America , volume=. 1969 , publisher=

  49. [49]

    , author=

    Free response approach to measurement and characterization of radiographic observer performance. , author=. AJR Am J Roentgenol , volume=

  50. [50]

    The Thirteenth International Conference on Learning Representations , year=

    Rethinking multiple-instance learning from feature space to probability space , author=. The Thirteenth International Conference on Learning Representations , year=

  51. [51]

    Adam: A Method for Stochastic Optimization

    Adam: A method for stochastic optimization , author=. arXiv preprint arXiv:1412.6980 , year=

  52. [52]

    Medical image analysis , volume=

    Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks , author=. Medical image analysis , volume=. 2020 , publisher=

  53. [53]

    International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

    Whole slide images are 2d point clouds: Context-aware survival prediction using patch-based graph convolutional networks , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2021 , organization=

  54. [54]

    arXiv preprint arXiv:2603.06658 , year=

    ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging , author=. arXiv preprint arXiv:2603.06658 , year=

  55. [55]

    The Fourteenth International Conference on Learning Representations , year=

    ASMIL: Attention-Stabilized Multiple Instance Learning for Whole-Slide Imaging , author=. The Fourteenth International Conference on Learning Representations , year=

  56. [56]

    Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nature Medicine, 2022.

  57. [57]

    The Cancer Genome Atlas pan-cancer analysis project. Nature Genetics, 2013.

  58. [58]

    A multimodal whole-slide foundation model for pathology. Nature Medicine, 2025.

  59. [59]

    Interventional bag multi-instance learning on whole-slide pathological images. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

  60. [60]

    Fast and accurate gigapixel pathological image classification with hierarchical distillation multi-instance learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.

  61. [61]

    Learning Sparse Neural Networks through L_0 Regularization. International Conference on Learning Representations.

  62. [62]

    The concrete distribution: A continuous relaxation of discrete random variables. Proceedings of the International Conference on Learning Representations.

  63. [63]

    Categorical Reparameterization with Gumbel-Softmax. International Conference on Learning Representations.

  64. [64]

    Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.

  65. [65]

    Axiomatic attribution for deep networks. International Conference on Machine Learning, 2017.

  66. [66]

    Visualizing and understanding convolutional networks. European Conference on Computer Vision, 2014.

  67. [67]

    An analysis of approximations for maximizing submodular set functions—I. Mathematical Programming, 1978.

  68. [68]

    Learning to explain: An information-theoretic perspective on model interpretation. International Conference on Machine Learning, 2018.

  69. [69]

    INVASE: Instance-wise variable selection using neural networks. International Conference on Learning Representations.

  70. [70]

    Interpretable explanations of black boxes by meaningful perturbation. Proceedings of the IEEE International Conference on Computer Vision.

  71. [71]

    RISE: Randomized Input Sampling for Explanation of Black-box Models. Proceedings of the British Machine Vision Conference (BMVC).

  72. [72]

    ERASER: A benchmark to evaluate rationalized NLP models. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.

  73. [73]

    Towards robust interpretability with self-explaining neural networks. Advances in Neural Information Processing Systems.

  74. [74]

    This looks like that: Deep learning for interpretable image recognition. Advances in Neural Information Processing Systems.

  75. [75]

    Concept bottleneck models. International Conference on Machine Learning, 2020.

  76. [76]

    Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). International Conference on Machine Learning, 2018.

  77. [77]

    Multiple instance learning framework with masked hard instance mining for whole slide image classification. Proceedings of the IEEE/CVF International Conference on Computer Vision.

  78. [78]

    Elements of information theory. 1999.