GCE-MIL: Faithful and Recoverable Evidence for Multiple Instance Learning in Whole-Slide Imaging

Ran Su; Xiangyu Li

arxiv: 2605.17456 · v1 · pith:X33C4SLMnew · submitted 2026-05-17 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-20 14:30 UTC · model grok-4.3

keywords: multiple instance learning · whole-slide imaging · evidence quality · attention mechanisms · sufficiency necessity recoverability · digital pathology · faithful explanations

The pith

GCE-MIL directly optimizes evidence for sufficiency, necessity and recoverability in whole-slide MIL instead of treating attention weights as a byproduct of classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard attention models in multiple instance learning for whole-slide images optimize only for slide-level accuracy, which produces patches that fail to support the diagnosis when kept alone, fail to change the prediction when removed, and fail to match the discrete selections used at inference. The paper claims that evidence quality improves when these three properties are optimized explicitly through dedicated components rather than inherited indirectly. GCE-MIL implements this as a wrapper around existing backbones using a grounding step tied to domain concepts, a noisy-OR term as a differentiable stand-in for coverage search, and a marginal-guided repair step that turns continuous scores into discrete patch sets. Across nine backbones and nine datasets the approach raises average Macro-F1 by 0.024 and C-index by 0.014, shrinks the continuous-discrete mismatch by four to seven points, and allows optional prefiltering that speeds inference up to five times while preserving nearly all utility.

Core claim

Evidence quality in MIL for whole-slide imaging is improved by optimizing directly for Sufficiency, Necessity, and Recoverability through three injection modes and three evidence components—grounding, noisy-OR coverage, and threshold-plus-repair recovery—rather than relying on attention optimized solely for classification.

What carries the argument

GCE-MIL wrapper consisting of a grounding mechanism that aligns selection with domain concepts, noisy-OR coverage as a differentiable proxy for interventional evidence, and threshold-plus-repair recovery that converts continuous attention into discrete recoverable subsets.
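As a rough illustration of the noisy-OR coverage term, the sketch below uses the form quoted from the paper elsewhere on this page, v_m(π) = 1 − ∏_i (1 − π_i r_im), where π_i is the selection score of patch i and r_im is its response to anchor m. The function name `noisy_or_coverage` and all numeric values are hypothetical; this is a minimal sketch of the formula, not the authors' implementation.

```python
from math import prod

def noisy_or_coverage(gates, responses):
    """Per-anchor coverage v_m = 1 - prod_i (1 - gate_i * r_im).

    gates:     selection scores pi_i in [0, 1], one per patch.
    responses: responses[i][m] = r_im in [0, 1] (patch i, anchor m).
    """
    n_anchors = len(responses[0])
    return [
        1.0 - prod(1.0 - g * row[m] for g, row in zip(gates, responses))
        for m in range(n_anchors)
    ]

# Toy bag with 3 patches and 2 anchors (hypothetical values).
responses = [
    [0.9, 0.0],  # patch 0 responds strongly to anchor 0
    [0.8, 0.1],  # patch 1 is largely redundant with patch 0
    [0.0, 0.7],  # patch 2 covers anchor 1
]

only_p0 = noisy_or_coverage([1, 0, 0], responses)
p0_and_p1 = noisy_or_coverage([1, 1, 0], responses)  # redundant pair
p0_and_p2 = noisy_or_coverage([1, 0, 1], responses)  # complementary pair

print(sum(only_p0), sum(p0_and_p1), sum(p0_and_p2))
```

In this toy bag, adding the redundant patch 1 raises total coverage from about 0.9 to about 1.08, while adding the complementary patch 2 raises it to about 1.6, which is the diminishing-returns behavior the Figure 9 caption describes.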

If this is right

Keeping only the selected patches maintains nearly the same Macro-F1 as the full slide.
Removing the selected patches produces a substantially larger change in the slide-level prediction.
Continuous attention scores become consistent with the discrete patch subsets actually used at inference.
Tile prefiltering after discrete recovery yields up to 5x faster inference while retaining 0.989 of full-bag utility.
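
The keep-only and removal checks in this list can be sketched on a toy bag. The example assumes a hypothetical slide-level model (`predict`, plain max-pooling over per-patch scores) and compares per-slide predictions rather than Macro-F1, so it only illustrates the shape of the test, not the paper's evaluation protocol.

```python
def predict(bag):
    """Hypothetical slide-level model: max-pooling over per-patch scores."""
    return max(bag, default=0.0)

def evidence_checks(bag, selected):
    """Return (sufficiency_gap, necessity_drop) for one slide.

    bag:      per-patch scores (toy one-dimensional "features").
    selected: set of patch indices forming the evidence subset.
    """
    kept = [s for i, s in enumerate(bag) if i in selected]
    rest = [s for i, s in enumerate(bag) if i not in selected]
    full = predict(bag)
    sufficiency_gap = abs(full - predict(kept))  # small means sufficient
    necessity_drop = full - predict(rest)        # large means necessary
    return sufficiency_gap, necessity_drop

bag = [0.10, 0.20, 0.95, 0.15, 0.90]
good = evidence_checks(bag, {2, 4})  # the two high-scoring patches
poor = evidence_checks(bag, {0, 1})  # two background patches
print(good, poor)
```

A faithful evidence set behaves like `good`: keeping it alone reproduces the full-bag prediction, and removing it changes the prediction substantially. The `poor` subset shows the opposite pattern.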

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same explicit S/N/R criteria could be ported to attention models outside digital pathology where explanations must remain faithful to inference behavior.
Recoverability may matter most in settings that require auditability or regulatory review of which exact patches drove a decision.
The reported gains rest on 81 configurations; larger-scale tests on unseen tissue types would clarify whether the wrapper generalizes without retuning.

Load-bearing premise

The three injection modes and evidence components can be added to arbitrary backbones without introducing new selection biases or requiring dataset-specific tuning.

What would settle it

Applying GCE-MIL to a new backbone or dataset outside the tested set and observing that Macro-F1 drops, the continuous-discrete gap widens, or inference utility falls below 0.989 would falsify the claim that the method reliably produces faithful evidence.

Figures

Figures reproduced from arXiv: 2605.17456 by Ran Su, Xiangyu Li.

**Figure 1.** Three evidence failures in classification-optimized MIL. (Left) Attention top-k achieves only 0.640 keep-only Macro-F1 vs. GCE 0.722: attention is not sufficient evidence. (Middle) The continuous selector becomes bimodal during training, enabling discretization with C-D gap 0.004. (Right) Adding GCE preserves 0.99× full-bag performance across backbones. Attention-regularized variants such as ACMIL, AEM, an…

**Figure 2.** GCE-MIL architecture. The framework wraps existing MIL backbones with three components: (1) low-rank adapter and semantic bridge for anchor grounding (drives Necessity via concept coverage), (2) continuous selector with exact noisy-OR coverage (drives Sufficiency via multi-source evidence), (3) threshold-plus-repair discrete recovery (drives Recoverability via the same marginal coverage objective). The hos…

**Figure 3.** Main qualitative evidence example. The host attention map and the recovered GCE evidence subset are shown on the same slide. GCE selects a compact, recoverable evidence set rather than simply visualizing the original attention ranking.

**Figure 4.** t-SNE before evidence grounding. Slide representations are less separated before the GCE evidence objective is applied. Panel labels: Epoch 1, 3, 5, 7, 9, 11, 13, 15.

**Figure 5.** t-SNE after evidence grounding. GCE training produces more separated slide representations, consistent with the classification and evidence diagnostics.

**Figure 6.** Mechanism 1: feature adaptation. Low-rank residual adaptation keeps the pretrained feature space close to UNI while improving selector compatibility.

**Figure 7.** Mechanism 2: anchor response. The bridge maps patch features into TITAN anchor space and produces patch-anchor responses used by the coverage utility.

**Figure 8.** Mechanism 3: continuous gate. Temperature annealing sharpens the selector distribution so that continuous gates can be recovered as a discrete subset.

**Figure 9.** Mechanism 4: noisy-OR utility. Exact noisy-OR coverage gives diminishing returns once an anchor is already covered, encouraging complementary evidence.

**Figure 10.** Mechanism 5: threshold recovery. The initial discrete subset is obtained by thresholding the continuous gate and falling back to the top patch when needed.

**Figure 11.** Mechanism 6: greedy repair. Repair adds patches according to exact marginal utility until coverage and sufficiency criteria are restored.

The following qualitative overlays show representative evidence maps exported from the evaluation pipeline. They are not used to claim pixel-level correctness; the quantitative support for evidence quality comes from keep-only, remove, C-D gap, and CAMELYON-16 localizat…

**Figure 12.** Qualitative evidence overlay 1. Representative slide-level attention and recovered GCE evidence.

**Figure 13.** Qualitative evidence overlay 2. Representative slide-level attention and recovered GCE evidence.

**Figure 14.** Qualitative evidence overlay 3. Representative slide-level attention and recovered GCE evidence.

**Figure 15.** Qualitative evidence overlay 4. Representative slide-level attention and recovered GCE evidence.

**Figure 16.** Qualitative evidence overlay 5. Representative slide-level attention and recovered GCE evidence.

**Figure 17.** Qualitative evidence overlay 6. Representative slide-level attention and recovered GCE evidence.

M. Limitations and Discussion. The S/N/R formalization operates at patch level; extending it to pixel-level or region-level evidence remains open. The anchor bank uses fixed text prompts that may not transfer to rare or previously unseen tissue types without prompt adaptation. Finally, S/N/R measures model-relat…

Original abstract

Multiple instance learning (MIL) is the standard approach for whole-slide image (WSI) classification and survival prediction, where attention-based models aggregate patch features into slide-level predictions. These models treat attention weights as evidence for their predictions, but attention is optimized for classification, not for identifying which patches actually support the diagnosis. This conflation leads to three failures: selected patches are insufficient (keeping them alone drops Macro-F1 by 0.078), unnecessary (removing them barely changes the prediction), and unrecoverable (continuous attention scores disagree with discrete patch subsets used at inference). The central premise is that evidence quality should be optimized directly through explicit criteria, Sufficiency, Necessity, and Recoverability (S/N/R), rather than inherited as a byproduct of classification. GCE-MIL is a backbone-agnostic wrapper implemented through three injection modes and three evidence components: a grounding mechanism that aligns selection with domain-specific concepts, noisy-OR coverage that acts as a differentiable proxy for interventional evidence search, and threshold-plus-repair recovery that converts continuous selectors into discrete subsets through marginal-guided repair. Across 9 backbones and 9 datasets (81 configurations), GCE-MIL improves average Macro-F1 by 0.024 and C-index by 0.014, reduces the continuous-discrete gap by 4-7, and increases complement degradation by 2-4. With optional tile prefiltering after discrete recovery, inference runs up to 5× faster while retaining 0.989 full-bag utility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

GCE-MIL wraps MIL backbones with explicit S/N/R targets and shows small consistent gains across 81 setups, but the abstract leaves measurement details and tuning risks unaddressed.

The main takeaway is that GCE-MIL tries to fix a real mismatch in whole-slide imaging: attention weights optimized for classification often produce patches that are insufficient, unnecessary, or hard to recover as discrete evidence. The paper defines Sufficiency, Necessity, and Recoverability as direct targets and supplies a wrapper with grounding to domain concepts, noisy-OR coverage, and threshold-plus-repair to hit them. This combination of three named components appears new even if the individual pieces draw from earlier MIL work.

The experiments cover nine backbones and nine datasets for 81 total configurations, which is a respectable scope. They report a 0.024 average Macro-F1 lift, 0.014 on C-index, and a 4-7 point reduction in the continuous-discrete gap, plus faster inference with optional prefiltering while retaining most utility. That broad testing and the concrete injection modes are the parts worth crediting. The noisy-OR step as a differentiable proxy for interventional search is a reasonable engineering choice.

The soft spots sit in the missing specifics and possible hidden dependencies. The abstract gives no equations for how sufficiency or necessity are scored exactly, no error bars, and no controls for the post-hoc tile prefiltering. The stress-test note about grounding and marginal-guided thresholds embedding dataset-specific choices is plausible; if those steps need per-dataset calibration, the backbone-agnostic claim and the reported gains could partly reflect extra tuning rather than intrinsic S/N/R improvements. If the full paper supplies the derivations, ablations, and sensitivity checks, the central premise holds up better.

This work is aimed at people building or deploying MIL systems for pathology who want more trustworthy patch selections without rewriting the core model. A reader focused on interpretability in medical imaging would get practical value from the three components and the scale of testing. It deserves a serious referee because the problem is common, the fix is straightforward to implement, and the experimental breadth is enough to justify closer review even with the current gaps in presentation.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces GCE-MIL, a backbone-agnostic wrapper for multiple instance learning (MIL) models used in whole-slide image (WSI) classification and survival prediction. It identifies failures in standard attention-based evidence (insufficient, unnecessary, unrecoverable) and proposes to optimize directly for Sufficiency, Necessity, and Recoverability (S/N/R) via three components: grounding to domain concepts, noisy-OR coverage as a differentiable proxy, and threshold-plus-repair for converting continuous to discrete. Across 9 backbones and 9 datasets yielding 81 configurations, it reports average Macro-F1 improvement of 0.024, C-index of 0.014, continuous-discrete gap reduction of 4-7, and complement degradation increase of 2-4, with optional prefiltering for up to 5x faster inference retaining 0.989 utility.

Significance. If the results hold, this could be a significant contribution to computational pathology by providing a way to generate more faithful evidence in MIL without sacrificing classification performance. The explicit S/N/R criteria and the large-scale evaluation across many configurations are positive aspects. The method's potential to improve both accuracy and interpretability makes it relevant for clinical applications where evidence quality matters.

major comments (3)

Abstract: The abstract states that selected patches are insufficient as keeping them alone drops Macro-F1 by 0.078, but does not specify the exact procedure for measuring sufficiency or necessity (e.g., how the subset is chosen, what threshold is used), which is load-bearing for the central claim that S/N/R optimization improves evidence quality.
Abstract: The backbone-agnostic claim and consistent gains across 81 configurations are central, yet the grounding mechanism's reliance on domain-specific concepts is not shown to be free of dataset-specific tuning or selection biases, as noted in the potential for implicit calibration in the repair step.
Abstract: The reduction in continuous-discrete gap by 4-7 and the increase in complement degradation by 2-4 are reported without error bars, confidence intervals, or details on the exact metrics used for the gap and degradation, making it hard to evaluate the statistical robustness of these improvements.

minor comments (2)

Abstract: Typographical errors: 'ag gregate' should be 'aggregate' and 'Recov erability' should be 'Recoverability'.
Abstract: The three injection modes are mentioned but not briefly described, which would help readers understand how they implement the evidence components.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments and the positive assessment of GCE-MIL's potential significance. We address each major comment point by point below with clarifications and proposed revisions.

Referee: Abstract: The abstract states that selected patches are insufficient as keeping them alone drops Macro-F1 by 0.078, but does not specify the exact procedure for measuring sufficiency or necessity (e.g., how the subset is chosen, what threshold is used), which is load-bearing for the central claim that S/N/R optimization improves evidence quality.

Authors: We agree that the abstract would benefit from greater precision on the measurement procedures. Sufficiency is measured by evaluating model performance (Macro-F1 or C-index) when the input bag is restricted to only the recovered discrete patch subset; necessity is measured by the performance change when the selected subset is removed from the full bag. The subset is obtained via the threshold-plus-repair procedure described in Section 3.3. We will revise the abstract to include a concise description of these procedures while referring readers to the Methods for full implementation details.

Revision: yes

Referee: Abstract: The backbone-agnostic claim and consistent gains across 81 configurations are central, yet the grounding mechanism's reliance on domain-specific concepts is not shown to be free of dataset-specific tuning or selection biases, as noted in the potential for implicit calibration in the repair step.

Authors: The grounding component maps patches to a fixed vocabulary of general pathology concepts (e.g., tumor epithelium, stroma, necrosis) that are applied uniformly across all nine datasets without per-dataset selection or hyperparameter tuning. The repair step is a deterministic, marginal-guided procedure that operates on the continuous scores and does not perform dataset-specific calibration. We will add a clarifying paragraph and supporting ablation in the revision demonstrating that the reported gains persist when grounding concepts are held constant, thereby reinforcing the backbone- and dataset-agnostic character of the wrapper.

Revision: yes

Referee: Abstract: The reduction in continuous-discrete gap by 4-7 and the increase in complement degradation by 2-4 are reported without error bars, confidence intervals, or details on the exact metrics used for the gap and degradation, making it hard to evaluate the statistical robustness of these improvements.

Authors: We will include standard deviations across the 81 configurations and 95% confidence intervals for these aggregate improvements in the revised abstract and results tables. The continuous-discrete gap is the absolute difference in downstream performance between continuous attention aggregation and the discrete recovered subset; complement degradation is the performance drop observed when the model is evaluated on the complement (non-selected) patches. Precise definitions and the requested statistical details will be added.

Revision: yes
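
The continuous-discrete gap defined in the last response can be illustrated with a small sketch. It assumes a gate-weighted mean as the continuous aggregation and a hard threshold at 0.5 for the discrete subset, and it compares per-slide scores rather than a dataset-level metric; the names `gated_mean` and `cd_gap` and all values are hypothetical.

```python
def gated_mean(scores, gates):
    """Gate-weighted mean of per-patch scores (0.0 if all gates are zero)."""
    total = sum(gates)
    return sum(g * s for g, s in zip(gates, scores)) / total if total else 0.0

def cd_gap(scores, gates, threshold=0.5):
    """Absolute gap between continuous and thresholded aggregation."""
    continuous = gated_mean(scores, gates)
    hard = [1.0 if g >= threshold else 0.0 for g in gates]
    discrete = gated_mean(scores, hard)
    return abs(continuous - discrete)

scores = [0.9, 0.8, 0.1, 0.2]          # toy per-patch scores
diffuse = [0.60, 0.55, 0.45, 0.40]     # soft, unsharpened gates
bimodal = [0.99, 0.98, 0.01, 0.02]     # gates pushed toward 0 or 1

gap_diffuse = cd_gap(scores, diffuse)
gap_bimodal = cd_gap(scores, bimodal)
print(gap_diffuse, gap_bimodal)
```

Both gate vectors select the same two patches after thresholding, but the diffuse gates give a gap of about 0.30 while the bimodal gates give about 0.01. This is consistent with the Figure 1 caption, which ties a small C-D gap to the selector becoming bimodal during training.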

Circularity Check

0 steps flagged

No significant circularity: S/N/R criteria and GCE-MIL components are defined externally to the reported empirical gains.

The paper defines Sufficiency, Necessity, and Recoverability as explicit optimization targets separate from the classification loss, then implements them as a backbone-agnostic wrapper with three injection modes and three evidence components (grounding, noisy-OR coverage, threshold-plus-repair). Reported gains (0.024 Macro-F1, 0.014 C-index, 4-7 gap reduction) are measured outcomes from 81 experimental configurations across 9 backbones and 9 datasets, not quantities forced by construction inside the method's own equations or self-citations. No load-bearing step reduces a claimed result to a fitted parameter or prior self-citation that itself assumes the target outcome; the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the untested premise that noisy-OR can serve as a differentiable proxy for interventional evidence search and that marginal-guided repair reliably converts continuous scores to discrete subsets without loss of utility.

free parameters (1)

threshold for discrete recovery
Used to convert continuous attention into binary patch selection; value not stated in abstract.
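
How such a threshold might interact with the repair step can be sketched from the Figure 10 and Figure 11 captions: threshold the gates, fall back to the top patch if nothing passes, then add patches greedily by marginal noisy-OR coverage. The stopping rule here (a total-coverage `target`) is an assumption standing in for the paper's "coverage and sufficiency criteria", and the function names and default values are hypothetical.

```python
from math import prod

def coverage(selected, responses):
    """Total noisy-OR coverage, summed over anchors, for a discrete subset."""
    n_anchors = len(responses[0])
    return sum(
        1.0 - prod(1.0 - responses[i][m] for i in selected)
        for m in range(n_anchors)
    )

def threshold_plus_repair(gates, responses, threshold=0.5, target=1.5):
    # Step 1: threshold the continuous gates.
    selected = {i for i, g in enumerate(gates) if g >= threshold}
    # Fallback: keep the single top-scoring patch if nothing passes.
    if not selected:
        selected = {max(range(len(gates)), key=gates.__getitem__)}
    # Step 2: greedy repair by exact marginal coverage gain.
    while coverage(selected, responses) < target:
        candidates = [i for i in range(len(gates)) if i not in selected]
        if not candidates:
            break
        best = max(candidates, key=lambda i: coverage(selected | {i}, responses))
        if coverage(selected | {best}, responses) <= coverage(selected, responses):
            break  # no remaining patch improves coverage
        selected.add(best)
    return selected

responses = [[0.9, 0.0], [0.8, 0.1], [0.0, 0.7]]  # toy patch-anchor responses
repaired = threshold_plus_repair([0.8, 0.3, 0.4], responses)
fallback = threshold_plus_repair([0.1, 0.3, 0.2], responses, target=0.0)
print(repaired, fallback)
```

In the first call only patch 0 passes the threshold, and repair adds patch 2 because it covers the anchor that patch 0 misses. In the second call no gate passes, so the fallback keeps the top-scoring patch.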

axioms (1)

Domain assumption: Attention weights can be meaningfully aligned with domain-specific concepts via the grounding mechanism.
Invoked when describing the grounding component.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: Noisy-OR coverage: $v_m(\pi) = 1 - \prod_i (1 - \pi_i r_{im})$. ... Proposition 1 (Submodularity of Noisy-OR Coverage).

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: Sufficiency, Necessity, and Recoverability (S/N/R) criteria for evidence quality.
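
For the noisy-OR passage quoted above, the submodularity named in Proposition 1 can be checked with a short calculation. The derivation below is a sketch under two assumptions not stated on this page: the selection is discrete ($\pi_i \in \{0,1\}$, written as a set $S$) and every response satisfies $r_{im} \in [0,1]$. It is not the paper's own proof.

```latex
\begin{align*}
v_m(S) &= 1 - \prod_{i \in S} (1 - r_{im}) \\
v_m(S \cup \{j\}) - v_m(S)
  &= \prod_{i \in S} (1 - r_{im}) - (1 - r_{jm}) \prod_{i \in S} (1 - r_{im}) \\
  &= r_{jm} \prod_{i \in S} (1 - r_{im}) \;\ge\; 0 .
\end{align*}
% For S \subseteq T the product over T contains every factor of the product
% over S plus extra factors in [0, 1], so
% \prod_{i \in T} (1 - r_{im}) \le \prod_{i \in S} (1 - r_{im}),
% and the marginal gain of adding patch j to T is no larger than adding it to S.
```

The gain is non-negative (monotone) and shrinks as the selected set grows (diminishing returns), which is the property that makes greedy selection by marginal gain a natural fit.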

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

78 extracted references · 78 canonical work pages · 4 internal anchors

[1]

IEEE transactions on cybernetics , volume=

Weakly supervised deep learning for whole slide lung cancer image analysis , author=. IEEE transactions on cybernetics , volume=. 2019 , publisher=

work page 2019
[2]

Nature medicine , volume=

Clinical-grade computational pathology using weakly supervised deep learning on whole slide images , author=. Nature medicine , volume=. 2019 , publisher=

work page 2019
[3]

IEEE Transactions on Circuits and Systems for Video Technology , volume=

Rethinking multiple instance learning for whole slide image classification: A good instance classifier is all you need , author=. IEEE Transactions on Circuits and Systems for Video Technology , volume=. 2024 , publisher=

work page 2024
[4]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Visual language pretrained multiple instance zero-shot transfer for histopathology images , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[5]

Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

Gecko: Gigapixel vision-concept contrastive pretraining in histopathology , author=. Proceedings of the IEEE/CVF International Conference on Computer Vision , pages=

work page
[6]

An attention-based multi-resolution model for prostate whole slide imageclassification and localization

An attention-based multi-resolution model for prostate whole slide imageclassification and localization , author=. arXiv preprint arXiv:1905.13208 , year=

work page internal anchor Pith review Pith/arXiv arXiv 1905
[7]

Nature biomedical engineering , volume=

Data-efficient and weakly supervised computational pathology on whole-slide images , author=. Nature biomedical engineering , volume=. 2021 , publisher=

work page 2021
[8]

Nature communications , volume=

An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning , author=. Nature communications , volume=. 2021 , publisher=

work page 2021
[9]

Proceedings of Machine Learning Research , volume=

Do Multiple Instance Learning Models Transfer? , author=. Proceedings of Machine Learning Research , volume=. 2025 , publisher=

work page 2025
[10]

European conference on computer vision , pages=

Attention-challenging multiple instance learning for whole slide image classification , author=. European conference on computer vision , pages=. 2024 , organization=

work page 2024
[11]

International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

Higt: Hierarchical interaction graph-transformer for whole slide image analysis , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2023 , organization=

work page 2023
[12]

arXiv preprint arXiv:2411.18225 , year=

Paths: A hierarchical transformer for efficient whole slide image analysis , author=. arXiv preprint arXiv:2411.18225 , year=

work page arXiv
[13]

The Twelfth International Conference on Learning Representations , year=

Olga Fourkioti and Matt. The Twelfth International Conference on Learning Representations , year=

work page
[14]

International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

AEM: attention entropy maximization for multiple instance learning based whole slide image classification , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2025 , organization=

work page 2025
[15]

International conference on machine learning , pages=

Attention-based deep multiple instance learning , author=. International conference on machine learning , pages=. 2018 , organization=

work page 2018
[16]

The journal of machine learning research , volume=

Dropout: a simple way to prevent neural networks from overfitting , author=. The journal of machine learning research , volume=. 2014 , publisher=

work page 2014
[17]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Momentum contrast for unsupervised visual representation learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[18]

Proceedings of the IEEE/CVF international conference on computer vision , pages=

Emerging properties in self-supervised vision transformers , author=. Proceedings of the IEEE/CVF international conference on computer vision , pages=

work page
[19]

The Bell system technical journal , volume=

A mathematical theory of communication , author=. The Bell system technical journal , volume=. 1948 , publisher=

work page 1948
[20]

Database , volume=

Bracs: A dataset for breast carcinoma subtyping in h&e histology images , author=. Database , volume=. 2022 , publisher=

work page 2022
[21]

IEEE transactions on medical imaging , volume=

From detection of individual metastases to classification of lymph node status at the patient level: the camelyon17 challenge , author=. IEEE transactions on medical imaging , volume=. 2018 , publisher=

work page 2018
[22]

Jama , volume=

Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer , author=. Jama , volume=

work page
[23]

Nature Reviews Bioengineering , volume=

Artificial intelligence for digital and computational pathology , author=. Nature Reviews Bioengineering , volume=. 2023 , publisher=

work page 2023
[24]

The Journal of pathology , volume=

Computational pathology in cancer diagnosis, prognosis, and prediction--present day and prospects , author=. The Journal of pathology , volume=. 2023 , publisher=

work page 2023
[25]

Journal of pathology informatics , volume=

Review of the current state of whole slide imaging in pathology , author=. Journal of pathology informatics , volume=. 2011 , publisher=

work page 2011
[26]

Artificial intelligence , volume=

Solving the multiple instance problem with axis-parallel rectangles , author=. Artificial intelligence , volume=. 1997 , publisher=

work page 1997
[27]

Advances in neural information processing systems , volume=

Transmil: Transformer based correlated multiple instance learning for whole slide image classification , author=. Advances in neural information processing systems , volume=

work page
[28]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Dtfd-mil: Double-tier feature distillation multiple instance learning for histopathology whole slide image classification , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[29]

Xiong, Yunyang and Zeng, Zhanpeng and Chakraborty, Rudrasis and Tan, Mingxing and Fung, Glenn and Li, Yin and Singh, Vikas , booktitle=. Nystr

work page
[30]

Frontiers in Oncology , volume=

Computational image analysis identifies histopathological image features associated with somatic mutations and patient survival in gastric adenocarcinoma , author=. Frontiers in Oncology , volume=. 2021 , publisher=

work page 2021
[31]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

Benchmarking self-supervised learning on diverse pathology datasets , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

work page
[32]

Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

work page
[33]

Medical image analysis , volume=

Weakly supervised histopathology cancer image segmentation and classification , author=. Medical image analysis , volume=. 2014 , publisher=

work page 2014
[34]

Bioinformatics , volume=

Classifying and segmenting microscopy images with deep multiple instance learning , author=. Bioinformatics , volume=. 2016 , publisher=

work page 2016
[35]

Distilling the Knowledge in a Neural Network

Distilling the knowledge in a neural network , author=. arXiv preprint arXiv:1503.02531 , year=

work page internal anchor Pith review Pith/arXiv arXiv
[36]

Medical Image Analysis , volume=

Ms-clam: Mixed supervision for the classification and localization of tumors in whole slide images , author=. Medical Image Analysis , volume=. 2023 , publisher=

work page 2023
[37]

International conference on machine learning , pages=

From softmax to sparsemax: A sparse model of attention and multi-label classification , author=. International conference on machine learning , pages=. 2016 , organization=

work page 2016
[38]

Journal of statistical physics , volume=

Possible generalization of Boltzmann-Gibbs statistics , author=. Journal of statistical physics , volume=. 1988 , publisher=

work page 1988
[39]

Advances in neural information processing systems , volume=

Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results , author=. Advances in neural information processing systems , volume=

work page
[40]

Transactions on Machine Learning Research Journal , year=

DINOv2: Learning Robust Visual Features without Supervision , author=. Transactions on Machine Learning Research Journal , year=

work page
[41]

Nature medicine , volume=

Towards a general-purpose foundation model for computational pathology , author=. Nature medicine , volume=. 2024 , publisher=

work page 2024
[42]

Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

Deep residual learning for image recognition , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

work page
[43] ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 2015.
[44] MambaMIL: Enhancing long sequence modeling with sequence reordering in computational pathology. International Conference on Medical Image Computing and Computer-Assisted Intervention, 2024.
[45] Visualizing data using t-SNE. Journal of Machine Learning Research.
[46] A simple framework for contrastive learning of visual representations. International Conference on Machine Learning, 2020.
[47] A framework for multiple-instance learning. Advances in Neural Information Processing Systems.
[48] The FROC curve: A representation of the observer's performance for the method of free response. The Journal of the Acoustical Society of America, 1969.
[49] Free response approach to measurement and characterization of radiographic observer performance. AJR Am J Roentgenol.
[50] Rethinking multiple-instance learning from feature space to probability space. The Thirteenth International Conference on Learning Representations.
[51] Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
[52] Whole slide images based cancer survival prediction using attention guided deep multiple instance learning networks. Medical Image Analysis, 2020.
[53] Whole slide images are 2D point clouds: Context-aware survival prediction using patch-based graph convolutional networks. International Conference on Medical Image Computing and Computer-Assisted Intervention, 2021.
[54] ASMIL: Attention-Stabilized Multiple Instance Learning for Whole Slide Imaging. arXiv preprint arXiv:2603.06658.
[55] ASMIL: Attention-Stabilized Multiple Instance Learning for Whole-Slide Imaging. The Fourteenth International Conference on Learning Representations.
[56] Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nature Medicine, 2022.
[57] The Cancer Genome Atlas Pan-Cancer analysis project. Nature Genetics, 2013.
[58] A multimodal whole-slide foundation model for pathology. Nature Medicine, 2025.
[59] Interventional bag multi-instance learning on whole-slide pathological images. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[60] Fast and accurate gigapixel pathological image classification with hierarchical distillation multi-instance learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[61] Learning Sparse Neural Networks through L_0 Regularization. International Conference on Learning Representations.
[62] The concrete distribution: A continuous relaxation of discrete random variables. Proceedings of the International Conference on Learning Representations.
[63] Categorical Reparameterization with Gumbel-Softmax. International Conference on Learning Representations.
[64] Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034.
[65] Axiomatic attribution for deep networks. International Conference on Machine Learning, 2017.
[66] Visualizing and understanding convolutional networks. European Conference on Computer Vision, 2014.
[67] An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming, 1978.
[68] Learning to explain: An information-theoretic perspective on model interpretation. International Conference on Machine Learning, 2018.
[69] INVASE: Instance-wise variable selection using neural networks. International Conference on Learning Representations.
[70] Interpretable explanations of black boxes by meaningful perturbation. Proceedings of the IEEE International Conference on Computer Vision.
[71] RISE: Randomized Input Sampling for Explanation of Black-box Models. Proceedings of the British Machine Vision Conference (BMVC).
[72] ERASER: A benchmark to evaluate rationalized NLP models. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.
[73] Towards robust interpretability with self-explaining neural networks. Advances in Neural Information Processing Systems.
[74] This looks like that: deep learning for interpretable image recognition. Advances in Neural Information Processing Systems.
[75] Concept bottleneck models. International Conference on Machine Learning, 2020.
[76] Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). International Conference on Machine Learning, 2018.
[77] Multiple instance learning framework with masked hard instance mining for whole slide image classification. Proceedings of the IEEE/CVF International Conference on Computer Vision.
[78] Elements of information theory. 1999.
