A Human-in-the-Loop Framework for Efficient Prompt Selection in Microscopy Vision-Language Models

Abhiram Kandiyana; Ankur Mali; Dmitry Goldgof; Lawrence O. Hall; Peter R. Mouton

arxiv: 2605.20495 · v1 · pith:FV7XKH2Hnew · submitted 2026-05-19 · 💻 cs.CV

Pith reviewed 2026-05-21 07:01 UTC · model grok-4.3

classification 💻 cs.CV

keywords: images, annotation, exemplars, prompt, selection, expert-verified, experts, framework

The pith

Targeted selection of images for expert verification lets vision-language models reach 100% test accuracy with as few as 20 annotated images on average.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to reduce the expensive expert annotation needed for training deep learning models on microscopy images. Instead of full annotation, it uses a vision-language model to draft captions for a small number of images, which experts verify and edit lightly. These verified image-caption pairs then serve as few-shot prompts to classify all other images. To decide which images to send to experts, the authors frame the problem as active learning and test three selection criteria on small pools of unlabeled data. If effective, this keeps human experts central to the process while slashing the number of images they must handle.

Core claim

By modeling prompt-set construction as a target-driven active learning problem and applying three complementary selection criteria, the framework prioritizes unlabeled microscopy images for expert verification. This produces compact prompt sets that allow the vision-language model to classify remaining images with high accuracy using far fewer verified exemplars than random selection.

What carries the argument

Three complementary selection criteria used to prioritize which images experts should verify and edit for building the prompt set.

Load-bearing premise

The three selection criteria effectively identify images whose verification produces prompt sets that generalize well to the full dataset.

What would settle it

A direct comparison on multiple microscopy datasets where the number of images needed to reach 100% accuracy is measured for the proposed criteria versus random selection; failure occurs if the criteria do not reduce the count.

Original abstract

Deep-learning pipelines for microscopy image classification often require expensive, labor- and time-intensive expert annotation to produce high-quality ground truth for training. Recent work has shown that prompt tuning of vision-language models (VLMs) can reduce manual annotation by constructing a small prompt set of expert-verified image-caption exemplars that is reused as few-shot context to classify all remaining images at inference time. To further reduce effort, the VLM can draft captions for candidate exemplars, which experts then verify and lightly edit instead of writing text de novo. However, two practical questions remain unaddressed: (1) which unlabeled images should be prioritized for verification, and (2) how many verified exemplars are needed to reach a performance target. In this work, we address these questions by formulating prompt-set construction as a target-driven active learning problem that prioritizes which images to annotate. We study three complementary selection criteria under strict low-resource constraints with small unlabeled pools. Experiments show that our methods reach the target performance with substantially fewer expert-verified images than random selection, achieving 100% test accuracy with as few as 20 annotated images on average. More broadly, our human-in-the-loop framework demonstrates a human-centered use of generative AI in biomedical image analysis, where experts remain actively involved in verifying and refining model output while significantly reducing annotation cost. Code and data will be publicly available.
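The target-driven loop the abstract describes (verify images in priority order and stop once a performance target is met) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `build_prompt_set`, `toy_score`, `toy_evaluate`, and the group-coverage objective are hypothetical stand-ins for a real selection criterion and a real VLM evaluation.

```python
from typing import Callable, List, Sequence, Set


def build_prompt_set(
    pool: Sequence[int],
    score: Callable[[int, Set[int]], float],
    evaluate: Callable[[Set[int]], float],
    target: float,
    batch_size: int = 5,
    budget: int = 50,
) -> List[int]:
    """Add the highest-scoring images batch by batch until the target is met."""
    chosen: List[int] = []
    limit = min(budget, len(pool))
    while len(chosen) < limit:
        selected = set(chosen)
        remaining = [i for i in pool if i not in selected]
        # Rank unverified images by the acquisition score (higher = more useful).
        remaining.sort(key=lambda i: score(i, selected), reverse=True)
        take = min(batch_size, limit - len(chosen))
        chosen.extend(remaining[:take])
        # In a real system an expert would verify the drafted captions here.
        if evaluate(set(chosen)) >= target:
            break
    return chosen


# Toy stand-ins (assumptions): image i belongs to group i % N_GROUPS, and
# "performance" is the fraction of groups with at least one exemplar.
POOL = list(range(30))
N_GROUPS = 6


def toy_score(i: int, selected: Set[int]) -> float:
    covered = {j % N_GROUPS for j in selected}
    return 0.0 if i % N_GROUPS in covered else 1.0


def toy_evaluate(selected: Set[int]) -> float:
    return len({j % N_GROUPS for j in selected}) / N_GROUPS


print(build_prompt_set(POOL, toy_score, toy_evaluate, target=1.0, batch_size=1))
```

The number of images returned is the annotation cost of reaching the target, which is the quantity the paper compares against random selection.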

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper turns prompt selection for microscopy VLMs into a target-driven active learning task and shows it can hit full accuracy with far fewer expert-verified images than random selection, though the small-pool results need tighter validation.

The main point is that the authors frame the problem of picking which images to send for expert verification as an active learning task aimed at building a small prompt set for few-shot VLM classification in microscopy. They test three selection criteria under tight low-resource limits and report reaching 100% test accuracy with an average of 20 annotated images, beating random selection by a clear margin.

Referee Report

1 major / 1 minor

Summary. The manuscript presents a human-in-the-loop framework for constructing prompt sets in vision-language models for microscopy image classification. It models prompt selection as a target-driven active learning problem and evaluates three complementary selection criteria under low-resource constraints with small unlabeled pools. The central empirical claim is that these criteria allow reaching 100% test accuracy with substantially fewer expert-verified images than random selection, specifically as few as 20 annotated images on average.

Significance. If the results hold, this framework offers a practical way to reduce expert annotation effort in biomedical microscopy analysis by combining VLM-generated captions with targeted expert verification. The emphasis on human involvement while leveraging generative AI is a positive contribution to the field. The public release of code and data would further strengthen reproducibility.

major comments (1)

Experiments section: The reported result of 100% test accuracy with an average of 20 annotated images is load-bearing for the central claim, yet the manuscript provides no pool-size ablations, variance estimates across repeated samplings, or explicit comparison of how the three selection criteria perform relative to random ordering when the unlabeled pool is small. Without these, it remains unclear whether the observed advantage is due to criterion quality or chance ordering under the very low-resource constraints emphasized in the paper.

minor comments (1)

Abstract: The statement that 'Code and data will be publicly available' should include a specific repository link or DOI to support the reproducibility claim.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and positive review of our manuscript. We address the single major comment below and will revise the paper to incorporate additional experimental details that strengthen the central claims.

Referee: Experiments section: The reported result of 100% test accuracy with an average of 20 annotated images is load-bearing for the central claim, yet the manuscript provides no pool-size ablations, variance estimates across repeated samplings, or explicit comparison of how the three selection criteria perform relative to random ordering when the unlabeled pool is small. Without these, it remains unclear whether the observed advantage is due to criterion quality or chance ordering under the very low-resource constraints emphasized in the paper.

Authors: We agree that these details would improve the robustness of the reported results. In the revised manuscript we will add pool-size ablations that vary the size of the unlabeled pool while keeping the target performance fixed, allowing readers to see how the advantage scales. We will also report performance averaged over five independent runs with different random seeds, including standard deviations, to quantify sampling variability. Finally, we will include an explicit side-by-side comparison (new table and figure) of the three selection criteria versus random ordering specifically for small pool sizes (up to 50 images), demonstrating that the performance gap persists consistently rather than arising from a single fortunate ordering. These additions will be placed in the Experiments section and will directly address the concern about low-resource constraints.

Revision: yes
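The kind of seed-averaged comparison promised in the rebuttal can be sketched as below. Everything here is an assumption made for illustration: the labels are synthetic, `round_robin_order` is a hypothetical criterion that cheats by using true classes as groups, and "reaching the target" is approximated by every class having at least one exemplar. None of it reproduces the paper's data or results.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

# Synthetic pool (assumption): 50 images in 4 imbalanced classes.
LABELS: List[int] = [0] * 30 + [1] * 12 + [2] * 5 + [3] * 3
N_CLASSES = 4


def annotations_to_target(order: Sequence[int]) -> int:
    """Toy proxy: images verified before every class has an exemplar."""
    seen = set()
    for count, idx in enumerate(order, start=1):
        seen.add(LABELS[idx])
        if len(seen) == N_CLASSES:
            return count
    return len(order)


def random_order(rng: random.Random) -> List[int]:
    order = list(range(len(LABELS)))
    rng.shuffle(order)
    return order


def round_robin_order(rng: random.Random) -> List[int]:
    """Hypothetical criterion: cycle through groups (here, the true classes)."""
    groups = [[i for i, y in enumerate(LABELS) if y == c] for c in range(N_CLASSES)]
    for group in groups:
        rng.shuffle(group)
    order: List[int] = []
    while any(groups):
        for group in groups:
            if group:
                order.append(group.pop())
    return order


def summarize(
    strategy: Callable[[random.Random], List[int]], seeds: Sequence[int]
) -> Tuple[float, float]:
    """Mean and sample standard deviation of the annotation count over seeds."""
    counts = [annotations_to_target(strategy(random.Random(s))) for s in seeds]
    return statistics.mean(counts), statistics.stdev(counts)


SEEDS = [0, 1, 2, 3, 4]  # five independent runs, as in the rebuttal
for name, strategy in [("random", random_order), ("round-robin", round_robin_order)]:
    mean, std = summarize(strategy, SEEDS)
    print(f"{name}: {mean:.1f} +/- {std:.1f} annotations to target")
```

Reporting the spread alongside the mean is what lets a reader judge whether a gap between a criterion and random ordering exceeds seed-to-seed noise.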

Circularity Check

0 steps flagged

Empirical active-learning framework with no circular derivation

The paper formulates prompt-set construction as a target-driven active learning problem and evaluates three selection criteria through experiments on microscopy datasets, reporting that the methods reach target accuracy with fewer expert-verified images than random selection. All central claims rest on direct experimental comparisons against an external baseline rather than any mathematical derivation, fitted parameter renamed as prediction, or self-citation chain. No equations, ansatzes, or uniqueness theorems are invoked that reduce to the paper's own inputs by construction. The work is therefore self-contained against its stated benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based on the abstract, the framework relies on standard assumptions from active learning and VLM prompt tuning without introducing new free parameters, axioms beyond domain norms, or invented entities.

axioms (1)

Domain assumption: Active learning selection criteria can identify the most informative images for building effective prompt sets in low-resource settings.
This assumption is central to prioritizing verification effort and achieving performance with fewer annotations.

pith-pipeline@v0.9.0 · 5795 in / 1244 out tokens · 73181 ms · 2026-05-21T07:01:11.130777+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We study three complementary selection criteria under strict low-resource constraints with small unlabeled pools... uncertainty-guided acquisition using stochastic decoding, complexity-aware uncertainty acquisition... density-tree boundary sampling"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "formulating prompt-set construction as a target-driven active learning problem"
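The first passage quoted above names uncertainty-guided acquisition using stochastic decoding. One common way to realize that idea is to sample several answers per image at nonzero temperature and rank images by the entropy of the resulting votes. The sketch below assumes that reading; it is not taken from the paper, and `vote_entropy`, `rank_by_uncertainty`, the image names, and the class labels are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def vote_entropy(sampled_labels: List[str]) -> float:
    """Shannon entropy (in nats) of the labels sampled for one image."""
    counts = Counter(sampled_labels)
    total = len(sampled_labels)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def rank_by_uncertainty(samples: Dict[str, List[str]]) -> List[str]:
    """Order images from most to least disagreement among sampled answers."""
    return sorted(samples, key=lambda name: vote_entropy(samples[name]), reverse=True)


# Hypothetical sampled answers: five stochastic decodes per image.
SAMPLES = {
    "img_a": ["class_1"] * 5,                      # model is consistent
    "img_b": ["class_1"] * 3 + ["class_2"] * 2,    # model disagrees with itself
    "img_c": ["class_1"] * 4 + ["class_2"],        # mild disagreement
}

print(rank_by_uncertainty(SAMPLES))
```

Images at the top of the ranking are the ones where the model contradicts itself most often, which makes them natural candidates for expert verification under this assumption.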

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 3 internal anchors

[1] tiktoken: Fast bpe tokeniser for openai models. https://github.com/openai/tiktoken. Accessed 2026-02-22.

[2] Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds. In International Conference on Learning Representations, 2020.

[3] Maria-Florina Balcan, Andrei Broder, and Tong Zhang. Margin based active learning. In Proceedings of the 20th Annual Conference on Learning Theory, pages 35–50, Berlin, Heidelberg, 2007. Springer-Verlag.

[4] Jihwan Bang, Sumyeong Ahn, and Jae-Gil Lee. Active prompt learning in vision language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 27004–27014, 2024.

[5] Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, and Furu Wei. Why can gpt learn in-context? language models implicitly perform gradient descent as meta-optimizers. arXiv preprint arXiv:2212.10559, 2022.

[6] Palak Dave, Yaroslav Kolinko, Hunter Morera, Kurtis Allen, Saeed Alahmari, Dmitry Goldgof, Lawrence O. Hall, and Peter R. Mouton. MIMO U-Net: efficient cell segmentation and counting in microscopy image sequences. In Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, 2023.

[7] Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, and Tong Zhang. Active prompting with chain-of-thought for large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1330–1350. Association for Computational Linguistics, 2024.

[8] Caleb Heinzman, Huazhang Guo, Mai He, and Ye Duan. LMMs for histopathology: zero- and few-shot patch classification with GPT and Gemini models. In Ninth International Conference on Advances in Image Processing (ICAIP 2025), page 140170T, 2026.

[9] Alex Holub, Pietro Perona, and Michael C Burl. Entropy-based active learning for object recognition. In IEEE computer society conference on computer vision and pattern recognition workshops, pages 1–8. IEEE, 2008.

[10] Abhiram Kandiyana, Peter R. Mouton, Yaroslav Kolinko, Lawrence O. Hall, and Dmitry Goldgof. Active prompt tuning enables gpt-4o to do efficient classification of microscopy images. In 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI), pages 01–05, 2025.

[11] Siddharth Karamcheti, Ranjay Krishna, Li Fei-Fei, and Christopher D Manning. Mind your outliers! investigating the negative impact of outliers on active learning for visual question answering. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Proc...

[12] David D. Lewis and William A. Gale. A sequential algorithm for training text classifiers. In Proceedings of SIGIR, 1994.

[13] Bin Li, Yin Li, and Kevin W Eliceiri. Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14318–14328, 2021.

[14] Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics, 12:157–173, 2024.

[15] H. Morera, P. Dave, S. Alahmari, Y. Kolinko, L.O. Hall, D. Goldgof, and P.R. Mouton. Mimo yolo - a multiple input multiple output model for automatic cell counting. In 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), pages 827–831, 2023.

[16] Hunter Morera, Palak Dave, Yaroslav Kolinko, Saeed Alahmari, Aidan Anderson, Grant Denham, Chloe Davis, Juan Riano, Dmitry Goldgof, Lawrence O. Hall, et al. A novel deep learning-based method for automatic stereology of microglia cells from low magnification images. Neurotoxicology and Teratology, 102:107336, 2024.

[17] Peter R. Mouton. Unbiased Stereology: A Concise Guide. Johns Hopkins University Press, 2011.

[18] Peter R Mouton. Neurostereology: unbiased stereology of neural systems. John Wiley & Sons, 2014.

[19] Daisuke Ono, Dennis W Dickson, and Shunsuke Koga. Evaluating the efficacy of few-shot learning for GPT-4Vision in neurodegenerative disease histopathology: A comparative analysis with convolutional neural network model. Neuropathol Appl Neurobiol, 50(4):e12997, 2024.

[20] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.

[21] Anastasiia Razdaibiedina, Yuning Mao, Madian Khabsa, Mike Lewis, Rui Hou, Jimmy Ba, and Amjad Almahairi. Residual prompt tuning: Improving prompt tuning with residual reparameterization. In Findings of the Association for Computational Linguistics: ACL 2023, pages 6740–6757, 2023.

[22] Bardia Safaei and Vishal M Patel. Active learning for vision-language models. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 4902–4912. IEEE, 2025.

[23] Andrew Sellergren, Sahar Kazemzadeh, Tiam Jaroensri, Atilla Kiraly, Madeleine Traverse, Timo Kohlberger, Shawn Xu, Fayaz Jamil, Cían Hughes, Charles Lau, et al. Medgemma technical report. arXiv preprint arXiv:2507.05201, 2025.

[24] Ozan Sener and Silvio Savarese. Active learning for convolutional neural networks: A core-set approach. In International Conference on Learning Representations, 2018.

[25] Burr Settles. Active learning literature survey. Technical Report 1648, University of Wisconsin–Madison, 2009.

[26] Bernard W. Silverman. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, 1 edition, 1998.

[27] Simon Tong and Daphne Koller. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research, 2:45–66, 2001.

[28] Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, and Furu Wei. Minilmv2: Multi-head self-attention relation distillation for compressing pretrained transformers. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2140–2151, 2021.

[29] Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, and Denny Zhou. Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171,

[30] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc V Le, and Denny Zhou. Chain-of-thought prompting elicits reasoning in large language models. In Advances in Neural Information Processing Systems, pages 24824–24837. Curran Associates, Inc., 2022.

[31] Sheng Zhang, Yanbo Xu, Naoto Usuyama, Hanwen Xu, Jaspreet Bagga, Robert Tinn, Sam Preston, Rajesh Rao, Mu Wei, Naveen Valluri, et al. Biomedclip: a multimodal biomedical foundation model pretrained from fifteen million scientific image-text pairs. arXiv preprint arXiv:2303.00915,

[32] Yuanhan Zhang, Kaiyang Zhou, and Ziwei Liu. What makes good examples for visual in-context learning? Advances in Neural Information Processing Systems, 36, 2024.

[33] Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Conditional prompt learning for vision-language models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16816–16825,

[34] Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to prompt for vision-language models. International journal of computer vision, 130(9):2337–2348,
