RADS: Reinforcement Learning-Based Sample Selection Improves Transfer Learning in Low-resource and Imbalanced Clinical Settings
Pith reviewed 2026-05-10 00:34 UTC · model grok-4.3
The pith
Reinforcement learning selects more useful samples than standard active learning for clinical transfer learning under scarcity and imbalance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RADS is a sample selection strategy that uses reinforcement learning to adaptively choose the most informative samples from a small, imbalanced pool for domain adaptation in clinical natural language processing tasks. Experiments on real-world clinical datasets demonstrate that this leads to enhanced model transferability and better handling of extreme class imbalance compared to conventional active learning techniques such as uncertainty and diversity sampling.
What carries the argument
RADS, an RL-based adaptive sampler that learns a policy to select samples maximizing post-fine-tuning performance in the target clinical domain.
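The review gives no pseudocode for this policy, but the shape of such a selector can be sketched as a toy REINFORCE loop: score candidates, sample one, observe a reward standing in for post-fine-tuning validation gain, and update the policy. Everything below (the pool, the linear softmax policy, the stand-in reward) is a hypothetical illustration, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

pool = rng.normal(size=(40, 2))           # candidate samples (toy 2-D features)
w_true = np.array([1.0, -0.5])            # hidden direction of "usefulness"
# Stand-in for the true reward (in RADS: improvement after fine-tuning on the pick).
informativeness = 1.0 / (1.0 + np.exp(-(pool @ w_true)))

theta = np.zeros(2)                       # linear scoring policy over sample features
baseline = 0.0                            # running reward baseline for variance reduction

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

for _ in range(500):
    probs = softmax(pool @ theta)
    i = rng.choice(len(pool), p=probs)
    reward = informativeness[i]           # observed reward for the selected sample
    grad = pool[i] - probs @ pool         # grad of log pi(i) for a linear softmax policy
    theta += 0.1 * (reward - baseline) * grad
    baseline = 0.9 * baseline + 0.1 * reward

top5 = np.argsort(pool @ theta)[-5:]      # samples the learned policy ranks highest
```

After training, the policy's top-ranked samples should be more informative than the pool average, which is the property the paper claims for RADS-selected samples.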
If this is right
- Selected samples lead to higher performance metrics on downstream clinical tasks like classification.
- The strategy remains effective even with extreme class imbalance in the available data.
- Transfer learning becomes more reliable in low-resource clinical settings without additional data collection.
- RADS outperforms uncertainty sampling and diversity sampling baselines on multiple datasets.
Where Pith is reading between the lines
- This method could be tested on non-clinical low-resource NLP tasks to check broader applicability.
- Different reward functions for the RL agent might further improve selection quality in future iterations.
- Combining RADS with other transfer learning techniques like data augmentation could yield additional gains.
Load-bearing premise
The reward signal used to train the reinforcement learning agent can guide it toward genuinely informative samples instead of being overwhelmed by the class imbalance or outliers in the tiny initial dataset.
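One way to see why this premise matters is to compare reward candidates directly: a macro-averaged F1 reward, unlike accuracy, refuses to credit a policy whose selections only teach the model the majority class. A self-contained illustration (the 9:1 split and the majority-only predictor are made up for the example):

```python
def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1: every class counts equally,
    # so a collapsed minority class drags the score down.
    labels = set(y_true) | set(y_pred)
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# 9:1 imbalance; a classifier that always predicts the majority class.
y_true = [0] * 9 + [1]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.9
reward = macro_f1(y_true, y_pred)                                      # ~0.47
```

An accuracy reward would tell the agent its selections are fine (0.9); the macro-F1 reward exposes the ignored minority class, which is the kind of signal the premise requires.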
What would settle it
The claim would be settled against RADS if, on a new low-resource and highly imbalanced clinical dataset, models fine-tuned on RADS-selected samples showed no improvement over, or worse results than, those using uncertainty sampling, as measured by standard metrics such as F1 score.
Original abstract
A common strategy in transfer learning is few-shot fine-tuning, but its success is highly dependent on the quality of the samples selected as training examples. Active learning methods such as uncertainty sampling and diversity sampling can select useful samples. However, under extremely low-resource and class-imbalanced conditions, they often favor outliers rather than truly informative samples, resulting in degraded performance. In this paper, we introduce RADS (Reinforcement Adaptive Domain Sampling), a robust sample selection strategy using reinforcement learning (RL) to identify the most informative samples. Experimental evaluations on several real-world clinical datasets show that our sample selection strategy enhances model transferability while maintaining robust performance under extreme class imbalance compared to traditional methods.
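The abstract's criticism of uncertainty sampling is concrete enough to demonstrate: the heuristic ranks candidates purely by predictive uncertainty (here, entropy), so a lone outlier the model happens to be confused about beats every in-distribution candidate. The probability values below are hypothetical:

```python
import math

def entropy(p):
    # Shannon entropy of a predicted class distribution (natural log).
    return -sum(q * math.log(q) for q in p if q > 0)

# Predicted class probabilities for a small candidate pool.
# Index 3 is an outlier: maximally uncertain, but not necessarily informative.
pool_probs = [
    [0.95, 0.05],
    [0.90, 0.10],
    [0.85, 0.15],
    [0.50, 0.50],
]
scores = [entropy(p) for p in pool_probs]
pick = max(range(len(scores)), key=scores.__getitem__)  # uncertainty sampling picks index 3
```

Under extreme imbalance, such maximally uncertain points are disproportionately likely to be noise or outliers, which is the failure mode RADS is built to avoid.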
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes RADS (Reinforcement Adaptive Domain Sampling), an RL-based sample selection method for improving transfer learning in low-resource, class-imbalanced clinical NLP settings. It claims that uncertainty and diversity sampling often select outliers under extreme imbalance, while RADS learns a policy to identify more informative samples, yielding better transferability and robustness on real clinical datasets.
Significance. If the empirical claims hold with proper controls, RADS could address a practical bottleneck in clinical transfer learning by providing a more stable alternative to standard active learning heuristics when labeled data is both scarce and skewed.
Major comments (2)
- [Abstract] The central claim that 'experimental evaluations on several real world clinical datasets show our sample selection strategy enhances model transferability while maintaining robust performance under extreme class imbalance' is asserted without any metrics, baselines, dataset sizes, imbalance ratios, or statistical tests, so the claim cannot be evaluated.
- [Method] RL component: no description is given of the reward function, state representation, or any balancing/diversity term that would prevent the policy gradient from being dominated by majority-class performance or outlier gradients in tiny initial pools; this premise is load-bearing for the claim that RL avoids the failure modes of uncertainty/diversity sampling.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and have made revisions to strengthen the paper where the comments identify areas for improvement.
Point-by-point responses
Referee: [Abstract] The central claim that 'experimental evaluations on several real world clinical datasets show our sample selection strategy enhances model transferability while maintaining robust performance under extreme class imbalance' is asserted without any metrics, baselines, dataset sizes, imbalance ratios, or statistical tests, so the claim cannot be evaluated.
Authors: We agree that the original abstract is too high-level and does not allow readers to immediately assess the strength of the central claim. In the revised manuscript we have updated the abstract to include concrete details: specific performance metrics (F1 improvements over baselines), the clinical datasets used along with their sizes and imbalance ratios, and a brief reference to the statistical tests performed. This revision makes the claim directly evaluable from the abstract while remaining concise. revision: yes
Referee: [Method] RL component: no description is given of the reward function, state representation, or any balancing/diversity term that would prevent the policy gradient from being dominated by majority-class performance or outlier gradients in tiny initial pools; this premise is load-bearing for the claim that RL avoids the failure modes of uncertainty/diversity sampling.
Authors: The referee is correct that a precise description of the RL components is necessary to substantiate the paper's claims. The initial submission's Method section did not provide sufficient detail on these elements. We have substantially expanded the Method section to explicitly define (1) the state representation (model embeddings combined with a summary of the labeled pool's class distribution), (2) the reward function (improvement in macro-F1 on a held-out validation set plus a diversity penalty term), and (3) an explicit balancing mechanism within the policy objective that down-weights majority-class gradients and outlier influence. These additions directly explain how the learned policy mitigates the outlier and imbalance problems observed with uncertainty and diversity sampling. revision: yes
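The components described in this response can be sketched directly. The embedding dimension, binary class set, and penalty weight `lam` below are hypothetical stand-ins, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def build_state(embedding, labeled_counts):
    """State = candidate embedding concatenated with the labeled pool's
    class-distribution summary (binary classes assumed for illustration)."""
    total = sum(labeled_counts.values()) or 1
    dist = np.array([labeled_counts.get(c, 0) / total for c in (0, 1)])
    return np.concatenate([embedding, dist])

def reward(macro_f1_after, macro_f1_before, candidate, labeled_embs, lam=0.1):
    """Reward = held-out macro-F1 gain minus a diversity penalty that
    discourages near-duplicates of already-labeled samples (lam hypothetical)."""
    gain = macro_f1_after - macro_f1_before
    if len(labeled_embs) == 0:
        return gain
    sims = labeled_embs @ candidate / (
        np.linalg.norm(labeled_embs, axis=1) * np.linalg.norm(candidate) + 1e-9
    )
    return gain - lam * float(sims.max())

emb = rng.normal(size=4)
state = build_state(emb, {0: 8, 1: 2})
# A candidate identical to a labeled sample: the diversity penalty
# outweighs the small F1 gain, yielding a negative reward.
r = reward(0.62, 0.55, emb, np.stack([emb, rng.normal(size=4)]))
```

Exposing the class distribution in the state and using a macro-F1 reward with a similarity penalty is the mechanism by which the policy could, as claimed, resist majority-class domination and outlier selection.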
Circularity Check
No significant circularity; the method is an application of standard RL without self-referential reductions.
Full rationale
The paper introduces RADS as a reinforcement learning strategy for sample selection in low-resource clinical transfer learning. No equations, fitted parameters, or derivations are presented that reduce by construction to the inputs (e.g., no self-definitional reward functions or predictions that are statistically forced from the same data). The central claims rest on experimental evaluations on external real-world datasets rather than on any load-bearing self-citation chain, uniqueness theorem from the authors, or ansatz smuggled via prior work. This is a standard methodological contribution with independent empirical content; no circular steps are identifiable from the provided text.