
arxiv: 2605.16365 · v1 · pith:7JSWEVRQnew · submitted 2026-05-09 · 💻 cs.LG · cs.DB

Machine Learning-Based Pre-Test Risk Stratification for PCR-Confirmed Chlamydia Using Patient-Reported Data and Urine Biomarkers

Pith reviewed 2026-05-20 22:06 UTC · model grok-4.3

classification 💻 cs.LG cs.DB
keywords machine learning · chlamydia · risk stratification · urine biomarkers · pre-test prediction · PCR testing · supervised classification · non-invasive screening

The pith

Machine learning models using patient-reported data and urine biomarkers can stratify pre-test risk for PCR-confirmed chlamydia.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether machine learning can predict who is likely to test positive for chlamydia using only non-invasive inputs, before any molecular test is run. It trains five classifiers on 93 urine samples with known PCR results, comparing three feature sets: patient history and symptoms alone, standard urinalysis measurements alone, and the two combined. Patient-reported data alone reached moderate discrimination (AUC up to 0.72), while urine biomarkers alone peaked slightly lower but gave steadier results across models. Combining both sources produced the most consistent performance and narrowed the differences between classifiers. This approach could let screening programs decide who needs a full PCR test based on cheap, routine information, including in home or decentralized settings.

Core claim

Models using the combination of patient-reported history and symptoms with urine biomarkers from standard urinalysis achieved marginally higher peak discrimination and reduced performance variability across classifiers compared with either feature group alone, showing that urine biomarkers supply a reliable and complementary predictive signal for pre-test risk stratification of PCR-confirmed Chlamydia trachomatis infection.

What carries the argument

Supervised classifiers trained on the integrated set of patient-reported features and urine biomarker features, evaluated with stratified 5-fold cross-validation and out-of-fold probability estimates.
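The evaluation scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the 30/63 class split is hypothetical, and a small gradient-descent logistic regression stands in for the five classifiers, which the source does not name. The helper names (`stratified_folds`, `out_of_fold_probs`) are invented for this sketch.

```python
import math
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign each sample to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k
    return fold_of

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=200, lr=0.1):
    """Batch gradient-descent logistic regression (stand-in classifier)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def out_of_fold_probs(X, y, k=5, seed=0):
    """Each sample is scored by a model that never saw it during fitting."""
    fold_of = stratified_folds(y, k, seed)
    oof = [None] * len(y)
    for fold in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != fold]
        held_out = [i for i in range(len(y)) if fold_of[i] == fold]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(held_out, predict_proba(model, [X[i] for i in held_out])):
            oof[i] = p
    return oof

def auc(y, scores):
    """AUC as the probability a positive outranks a negative (ties count half)."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (NOT the paper's cohort): 93 samples, 2 features,
# with a hypothetical 30 positive / 63 negative split.
rng = random.Random(42)
y = [1] * 30 + [0] * 63
X = [[rng.gauss(1.0 if t else 0.0, 1.0), rng.gauss(0.0, 1.0)] for t in y]
oof = out_of_fold_probs(X, y)
print("pooled out-of-fold AUC:", round(auc(y, oof), 3))
```

Pooling the held-out probabilities before computing AUC gives one estimate over all 93 samples rather than five noisy per-fold estimates, which is one plausible reading of "out-of-fold probability estimates" for a cohort this small.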

If this is right

  • Non-invasive patient-reported data and urine tests can be combined to prioritize who receives PCR confirmation in resource-limited screening.
  • Screening workflows can incorporate pre-test models to reduce unnecessary molecular tests while maintaining detection rates.
  • Urine biomarkers alone already deliver consistent risk signals that complement what patients report.
  • Feature integration makes model performance more stable across different classifiers.
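To make the triage idea in the bullets above concrete, the sketch below computes the threshold-dependent quantities a screening program would weigh: sensitivity, specificity, and the share of people routed to PCR. The function name `triage_metrics`, the labels, the risk scores, and both thresholds are invented for illustration and do not come from the paper.

```python
def triage_metrics(y_true, probs, threshold):
    """Sensitivity, specificity, and share of people routed to PCR at a threshold."""
    flagged = [p >= threshold for p in probs]
    tp = sum(1 for t, f in zip(y_true, flagged) if t == 1 and f)
    fn = sum(1 for t, f in zip(y_true, flagged) if t == 1 and not f)
    tn = sum(1 for t, f in zip(y_true, flagged) if t == 0 and not f)
    fp = sum(1 for t, f in zip(y_true, flagged) if t == 0 and f)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "pcr_fraction": (tp + fp) / len(y_true),
    }

# Toy labels and risk scores, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.4, 0.05]
for threshold in (0.15, 0.5):
    print(threshold, triage_metrics(y_true, probs, threshold))
```

In this toy data, lowering the threshold from 0.5 to 0.15 catches every positive but doubles the share sent to PCR, which is the trade-off any deployment of a pre-test model would have to settle.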

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Programs could lower overall testing costs by routing only high-risk individuals to PCR based on these routine inputs.
  • Home-based or remote collection kits might add simple urine dipstick results to improve triage accuracy before lab processing.
  • Larger multi-site datasets would be needed to check whether the same feature weights hold in different age groups or regions.

Load-bearing premise

The 93 curated urine samples with PCR labels represent the broader population and the cross-validation procedure produces stable, generalizable performance without substantial overfitting.

What would settle it

An external validation study on an independent cohort of several hundred patients in which the combined-feature model yields an AUC below 0.65 or shows high variability across folds would show the predictive signals are not reliable for clinical use.

Figures

Figures reproduced from arXiv: 2605.16365 by Katrin Krolov, Marko Lehes, Mehrab Mahdian, Tamas Pardy.

Figure 1: Age distribution of the post-cleaning cohort.
Figure 2: AUC point estimates with 95% bootstrap confidence intervals for ...
Figure 3: Sensitivity versus specificity for all models and feature sets based ...
original abstract

Early identification of individuals at elevated risk of Chlamydia trachomatis infection may enable optimal use of molecular testing in resource-aware screening. We evaluate the feasibility of pre-test risk stratification (PTRS) using machine-learning models trained on routinely available, non-invasive clinical data. A curated dataset of 93 urine samples with PCR reference labels was analyzed using three feature groups: patient-reported history and symptoms, urine biomarkers from standard urinalysis, and their combination. Five supervised classifiers were evaluated using stratified 5-fold cross-validation with out-of-fold probability estimates. Performance was assessed using area under the receiver operating characteristic curve (AUC) and threshold-dependent metrics, with uncertainty quantified via bootstrap confidence intervals. Models using only patient-reported data showed moderate discrimination (AUC up to 0.72). Urine biomarker-based models demonstrated slightly lower peak discrimination but more consistent performance, with ensemble methods yielding the strongest results. Combining feature groups marginally increased the peak AUC and reduced performance variability across models, indicating improved robustness. Findings indicate that urine biomarkers provide a reliable predictive signal for PTRS that is complementary to patient-reported information, while feature integration enhances robustness. This work supports the integration of non-invasive, routinely available information for PTRS into screening workflows, including decentralized or home-based PCR contexts, to optimize testing prioritization.
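The abstract quantifies uncertainty with bootstrap confidence intervals but does not say which variant was used. The sketch below assumes a percentile bootstrap over (label, score) pairs, one common choice. The data are synthetic, the 30/63 class split is hypothetical, and the function name `bootstrap_auc_ci` is invented for this example.

```python
import random

def auc(y, scores):
    """AUC as the probability a positive outranks a negative (ties count half)."""
    pos = [s for s, t in zip(scores, y) if t == 1]
    neg = [s for s, t in zip(scores, y) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling (label, score) pairs."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < n:  # AUC needs both classes in the resample
            stats.append(auc(yb, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

# Synthetic labels and scores (NOT the paper's data): positives score higher on average.
rng = random.Random(1)
y = [1] * 30 + [0] * 63
scores = [rng.gauss(0.8 if t else 0.0, 1.0) for t in y]
point = auc(y, scores)
lower, upper = bootstrap_auc_ci(y, scores, n_boot=500)
print(f"AUC {point:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With fewer than a hundred samples the interval is typically wide, which is why the comparison between feature groups is better read from the intervals than from peak AUC alone.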

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript presents a machine learning approach for pre-test risk stratification (PTRS) of Chlamydia trachomatis using patient-reported data and urine biomarkers from a dataset of 93 PCR-labeled urine samples. It evaluates five supervised classifiers with stratified 5-fold cross-validation, reporting AUC up to 0.72 for patient-reported features, slightly lower but more consistent for biomarkers, and marginal gains with combined features, concluding support for integration into screening workflows.

Significance. If the findings hold, this work could contribute to optimizing molecular testing in resource-limited or home-based settings by leveraging routinely available non-invasive data. The use of out-of-fold predictions and bootstrap confidence intervals is a strength for quantifying uncertainty in small-sample settings.

major comments (3)
  1. Abstract: The peak AUC is described only as 'up to 0.72' without exact values, per-model results, or bootstrap CIs for each feature group; this prevents evaluation of the claimed marginal AUC increase and reduced variability from feature integration.
  2. Abstract / Methods: With n=93 and stratified 5-fold CV, out-of-fold AUC estimates lack supporting power analysis or external validation; the claims of a 'reliable predictive signal' that is 'complementary' and yields 'enhanced robustness' rest on potentially unstable small-sample performance that may not generalize.
  3. Abstract: No demographic breakdown, explicit list of urine biomarkers, or patient-reported features is provided, which limits assessment of whether the curated set is representative and whether implicit feature curation inflates apparent complementarity.
minor comments (1)
  1. Abstract: 'Ensemble methods yielding the strongest results' is stated without naming the ensembles or providing comparative metrics against the five individual classifiers.
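The sample-size concern in major comment 2 can be given a rough scale with the Hanley and McNeil (1982) approximation for the standard error of an AUC. This is a back-of-the-envelope sketch, not an analysis from the paper: the abstract does not report how many of the 93 samples were PCR-positive, so the 30/63 split below is purely hypothetical, and the function name `hanley_mcneil_se` is invented here.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

auc_reported = 0.72      # peak AUC quoted in the abstract
n_pos, n_neg = 30, 63    # HYPOTHETICAL class split; not reported in the abstract

se = hanley_mcneil_se(auc_reported, n_pos, n_neg)
half_width = 1.96 * se   # normal-approximation 95% half-width
print(f"approximate SE: {se:.3f}")
print(f"approximate 95% interval: {auc_reported - half_width:.2f} to {auc_reported + half_width:.2f}")

# The same AUC on a cohort ten times larger, for comparison.
print(f"SE at 10x cohort size: {hanley_mcneil_se(auc_reported, 300, 630):.3f}")
```

Under these assumed counts the interval spans more than 0.1 AUC on either side of the point estimate, so "marginal" differences between feature groups would be hard to distinguish from noise.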

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their thoughtful review and constructive feedback on our manuscript. We have carefully considered each comment and made revisions to enhance the clarity, transparency, and acknowledgment of limitations in the work.

point-by-point responses
  1. Referee: Abstract: The peak AUC is described only as 'up to 0.72' without exact values, per-model results, or bootstrap CIs for each feature group; this prevents evaluation of the claimed marginal AUC increase and reduced variability from feature integration.

    Authors: We agree that more precise reporting in the abstract would facilitate better evaluation of our results. In the revised version, we have updated the abstract to include the specific peak AUC values for each feature group along with their bootstrap confidence intervals. Per-model results remain detailed in the main text and tables, but we now reference the key metrics more explicitly in the abstract to highlight the marginal gains and reduced variability from feature integration. revision: yes

  2. Referee: Abstract / Methods: With n=93 and stratified 5-fold CV, out-of-fold AUC estimates lack supporting power analysis or external validation; the claims of a 'reliable predictive signal' that is 'complementary' and yields 'enhanced robustness' rest on potentially unstable small-sample performance that may not generalize.

    Authors: The small sample size is indeed a limitation of this feasibility study. We have added a post-hoc power analysis in the Methods section to justify the stratified 5-fold cross-validation approach for this n. We maintain that the out-of-fold predictions and bootstrap CIs provide a reasonable quantification of uncertainty. However, external validation on an independent dataset was not available in this study; we have expanded the Discussion section to explicitly discuss the potential for overfitting and the need for prospective validation in larger, multi-site cohorts to confirm generalizability. revision: partial

  3. Referee: Abstract: No demographic breakdown, explicit list of urine biomarkers, or patient-reported features is provided, which limits assessment of whether the curated set is representative and whether implicit feature curation inflates apparent complementarity.

    Authors: We acknowledge that the abstract omitted these details for brevity. The revised abstract now includes a high-level description of the patient-reported features (e.g., symptoms, sexual history) and urine biomarkers (e.g., leukocyte esterase, nitrites from urinalysis), along with basic demographic characteristics of the cohort. The complete lists and curation process are fully described in the Methods section and Supplementary Table S1 to allow assessment of representativeness and to address concerns about feature selection. revision: yes

standing simulated objections not resolved
  • The lack of an independent external validation cohort, which prevents definitive claims about generalizability beyond the current dataset.

Circularity Check

0 steps flagged

No circularity in the ML evaluation or feature integration claims

full rationale

The paper trains standard supervised classifiers on three feature groups (patient-reported data, urine biomarkers, and their combination) to predict PCR labels from a fixed set of 93 samples. Performance is reported via stratified 5-fold cross-validation using out-of-fold probability estimates and bootstrap CIs; this is a conventional, non-circular evaluation procedure that does not reduce any reported AUC or robustness metric to a fitted parameter by construction. No self-definitional equations, load-bearing self-citations, uniqueness theorems, or ansatz smuggling appear in the provided text. The central claims about complementary signal and enhanced robustness rest on empirical CV results rather than on any redefinition or renaming of the inputs themselves.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard supervised-learning assumptions plus a small curated clinical dataset; no new physical entities are introduced.

free parameters (2)
  • classifier hyperparameters
    Tuning parameters for the five supervised models are chosen during training or validation but not enumerated.
  • feature preprocessing choices
    Decisions on which patient-reported items and which urine biomarkers to include are implicit and affect the reported AUCs.
axioms (2)
  • domain assumption: The 93 samples are independent and identically distributed draws from the target screening population.
    Required for cross-validation to estimate out-of-sample performance.
  • domain assumption: Standard urinalysis biomarkers contain non-redundant signal for Chlamydia infection.
    Invoked when treating urine results as predictive features complementary to history.

pith-pipeline@v0.9.0 · 5778 in / 1459 out tokens · 59611 ms · 2026-05-20T22:06:01.071180+00:00 · methodology


