
arxiv: 2512.15056 · v3 · submitted 2025-12-17 · 📊 stat.AP

Routine Blood Biomarkers Reveal a Preclinical Continuum of Multiple Myeloma Risk

Pith reviewed 2026-05-16 22:00 UTC · model grok-4.3

classification 📊 stat.AP
keywords multiple myeloma · preclinical biomarkers · routine blood tests · protein dysregulation · risk prediction · albumin-to-globulin ratio · hematopoietic changes

The pith

Routine blood tests detect protein and blood cell changes that precede multiple myeloma diagnosis by more than ten years.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Researchers analyzed routine blood measurements from nearly 300,000 cancer-free UK Biobank participants followed for a median of 12.4 years, during which 768 developed multiple myeloma. They found that markers of protein dysregulation, including elevated total protein, depressed albumin, and a low albumin-to-globulin ratio, along with signs of erythropoietic suppression and immune cell shifts, were linked to later diagnosis. These deviations emerged more than a decade before clinical onset and intensified as the disease approached. Adding the biomarkers to a clinical risk model raised ten-year discrimination from a C-index of 0.684 to 0.744. The approach relies on tests already performed in standard care, and the excess risk is concentrated among individuals with extreme marker values.
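For readers unfamiliar with the C-index quoted above, the following is a minimal sketch of Harrell's concordance index for right-censored data in plain Python. It is an illustration only: the source does not say which estimator or software the authors used, and the function name `harrell_c` and its argument layout are assumptions made here.

```python
def harrell_c(times, events, scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : follow-up time for each participant
    events : 1 if the event (here, an MM diagnosis) was observed, 0 if censored
    scores : model risk score, higher meaning higher predicted risk

    A pair (i, j) is comparable when i had an observed event strictly
    before j's follow-up ended. Pairs with tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored participant cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # earlier case ranked as higher risk
                elif scores[i] == scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported move from 0.684 to 0.744 means the augmented model more often ranks an earlier case above a participant who was still event-free at that time.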

Core claim

In a prospective analysis of 299,035 cancer-free participants, a biomarker-wide scan across 61 routine blood analytes identified markers of protein dysregulation and hematopoietic changes that associate with incident multiple myeloma, with longitudinal trajectories showing these multi-system deviations detectable more than ten years before diagnosis and nonlinear dose-response patterns concentrating risk at extreme values.

What carries the argument

Biomarker-wide association scan across routine hematological, protein, renal, and immune analytes that tracks preclinical protein metabolism perturbations and coordinated blood cell shifts as early risk signals.

If this is right

  • Incorporating significant biomarkers improved 10-year MM discrimination from a C-index of 0.684 to 0.744.
  • The high-risk decile accumulated 0.79% cumulative incidence versus 0.47% under the clinical model alone.
  • Longitudinal analyses showed multi-system deviations emerge more than a decade before diagnosis and intensify toward onset.
  • Dose-response modelling revealed nonlinear associations with risk concentrated at extreme values of protein and erythrocytic markers.
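The decile comparison in the list above can be sketched as follows, assuming risk scores from some already fitted model and taking one minus the Kaplan-Meier estimate as the cumulative incidence. Both choices are assumptions for illustration: the source does not state how the authors computed cumulative incidence or whether competing risks such as death were handled, and the helper names are hypothetical.

```python
def km_cumulative_incidence(times, events, horizon):
    """Return 1 - Kaplan-Meier survival at `horizon` (competing risks ignored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
        at_risk -= n_leaving
    return 1.0 - survival


def top_decile(scores):
    """Indices of the 10% of participants with the highest risk scores."""
    k = max(1, len(scores) // 10)
    order = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    return order[:k]


def top_decile_incidence(times, events, scores, horizon=10.0):
    """Cumulative incidence by `horizon` years within the top risk decile."""
    idx = top_decile(scores)
    return km_cumulative_incidence(
        [times[k] for k in idx], [events[k] for k in idx], horizon
    )
```

Running `top_decile_incidence` once with scores from the clinical model and once with scores from the biomarker-augmented model gives the kind of side-by-side figures reported above (0.47% versus 0.79%).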

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Routine panels could support targeted surveillance in primary care for individuals showing early deviations without requiring specialized testing.
  • Replication across diverse populations would test whether the same markers retain predictive strength outside the original cohort.
  • Electronic health record integration might enable automated flagging of rising risk based on serial routine results.
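As a rough illustration of the automated-flagging idea in the last bullet, the sketch below computes the albumin-to-globulin ratio from serial albumin and total protein results (globulin taken as total protein minus albumin) and flags a sustained decline using a least-squares slope. Everything here is hypothetical: the function names, the minimum of three visits, and especially the slope threshold are placeholders chosen for the example, not values from the paper and not clinically validated.

```python
def ag_ratio(albumin, total_protein):
    """Albumin-to-globulin ratio, with globulin = total protein - albumin (same units)."""
    globulin = total_protein - albumin
    if globulin <= 0:
        raise ValueError("total protein must exceed albumin")
    return albumin / globulin


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def flag_declining_ag(visits, slope_threshold=-0.02, min_visits=3):
    """Flag a participant whose A/G ratio falls faster than the threshold.

    visits          : list of (years_since_first_test, albumin, total_protein)
    slope_threshold : HYPOTHETICAL cutoff in A/G units per year, illustration only
    """
    if len(visits) < min_visits:
        return False  # too few results to estimate a trend
    years = [v[0] for v in visits]
    ratios = [ag_ratio(v[1], v[2]) for v in visits]
    return ols_slope(years, ratios) < slope_threshold
```

Any real deployment would need thresholds derived and validated on outcome data, which is the replication question raised in the second bullet.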

Load-bearing premise

The observed biomarker associations reflect true preclinical biological changes rather than confounding by unmeasured factors or reverse causation, and the risk model generalizes beyond the study population.

What would settle it

Absence of the same biomarker associations with future multiple myeloma in an independent prospective cohort followed for comparable duration would disprove the existence of a detectable preclinical continuum using routine tests.

Figures

Figures reproduced from arXiv: 2512.15056 by Bingjie Li, Feiyue Pan, Jiadai Xu, Peng Liu, Shing-Tung Yau, Yiqing Sun, Zhigang Yao.

Figure 1: Nonlinear associations between baseline biomarker levels and multiple myeloma incidence.

Original abstract

Multiple myeloma (MM) is preceded by a long preclinical phase spanning decades, yet scalable, non-specialist tools to identify individuals at elevated risk before end-organ damage are lacking. In a prospective analysis of 299,035 cancer-free UK Biobank participants followed for a median of 12.4 years, during which 768 developed incident MM, we conducted a biomarker-wide association scan across 61 routinely measured blood analytes spanning hematological, protein metabolism, renal, and immune categories. Markers of protein dysregulation, namely elevated total protein, depressed albumin, and a low albumin-to-globulin (A/G) ratio, showed the strongest preclinical associations (hazard ratios 0.61-1.54 per SD), consistent with progressive monoclonal immunoglobulin accumulation and suppression of normal polyclonal synthesis years before diagnosis. These signals were accompanied by indicators of erythropoietic suppression, morphological red cell dysregulation, and a shift toward lower neutrophil and higher lymphocyte fractions, reflecting coordinated perturbations across hematopoietic and immune compartments. Longitudinal trajectory analyses showed that these multi-system deviations emerge more than a decade before diagnosis and intensify as clinical onset approaches. Dose-response modelling revealed pronounced nonlinear associations for protein and erythrocytic markers, with risk concentrated among individuals with extreme values. Incorporating significant biomarkers into a clinical risk model improved 10-year MM discrimination from a C-index of 0.684 to 0.744, with the high-risk decile accumulating 0.79% cumulative incidence versus 0.47% under the clinical model alone. These findings provide a practical framework for biomarker-guided MM risk stratification and targeted surveillance using routinely available clinical tests.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes 299,035 cancer-free UK Biobank participants followed for a median 12.4 years (768 incident MM cases). It reports a biomarker-wide scan across 61 routine blood analytes, identifies strongest preclinical associations with protein-dysregulation markers (elevated total protein, depressed albumin, low A/G ratio; HRs 0.61–1.54 per SD), documents longitudinal trajectories emerging >10 years pre-diagnosis, and shows that adding selected biomarkers to a clinical model raises 10-year MM discrimination (C-index 0.684 to 0.744) while concentrating 10-year cumulative incidence in the top decile (0.79% vs 0.47%).

Significance. If the reported discrimination gain and decile separation reflect out-of-sample performance, the work supplies a scalable, routine-lab-based framework for identifying individuals at elevated MM risk years before clinical onset. The large prospective cohort, direct testing against incident cases, and multi-system biomarker trajectories constitute clear strengths for an applied statistical study of preclinical risk stratification.

major comments (2)
  1. [Results, discrimination and decile analysis] Results section on risk-model performance: the C-index increase from 0.684 to 0.744 and the high-risk decile incidence figures (0.79% vs 0.47%) are presented without cross-validation, split-sample, bootstrap, or temporal-holdout evaluation. Because biomarkers were selected post-hoc from a 61-analyte scan within the same 299k cohort, the reported improvement risks in-sample optimism and may not generalize.
  2. [Methods, statistical modeling] Methods, statistical modeling subsection: the construction of the base clinical model and the biomarker-augmented model does not specify whether variable selection used penalized regression, pre-specified covariates, or any internal validation procedure before computing the C-index and cumulative incidence estimates.
minor comments (2)
  1. [Abstract and Results] The abstract and results do not state the exact set of clinical covariates included in the base model or whether the C-index was time-dependent.
  2. [Discussion] Discussion of residual confounding and reverse causation is brief; explicit sensitivity analyses (e.g., lag-time restriction or comorbidity adjustment) would strengthen the preclinical interpretation.
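The lag-time restriction suggested in the second minor comment is commonly implemented as a landmark analysis: participants whose follow-up ends within the first few years are dropped, and the clock restarts at the landmark, so that cases already developing at baseline cannot drive the associations. A minimal sketch follows, assuming each record is a `(time_in_years, event)` pair; the function name and the data layout are hypothetical and not taken from the paper.

```python
def apply_lag_restriction(records, lag_years):
    """Landmark-style lag restriction (illustrative sketch).

    records   : iterable of (time_in_years, event), with event 1 for a
                diagnosis and 0 for censoring
    lag_years : length of the exclusion window after baseline

    Anyone whose follow-up ends at or before the landmark is excluded,
    whether through diagnosis or censoring. Remaining follow-up is
    measured from the landmark.
    """
    restricted = []
    for time, event in records:
        if time <= lag_years:
            continue  # diagnosed or censored inside the lag window
        restricted.append((time - lag_years, event))
    return restricted
```

Refitting the same models on the restricted records for several lag lengths (for example 2, 5, and 10 years) and checking whether the hazard ratios persist is one way to probe reverse causation.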

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which have helped us strengthen the statistical presentation of our risk modeling results. We address each major comment below and have revised the manuscript accordingly.

  1. Referee: [Results, discrimination and decile analysis] Results section on risk-model performance: the C-index increase from 0.684 to 0.744 and the high-risk decile incidence figures (0.79% vs 0.47%) are presented without cross-validation, split-sample, bootstrap, or temporal-holdout evaluation. Because biomarkers were selected post-hoc from a 61-analyte scan within the same 299k cohort, the reported improvement risks in-sample optimism and may not generalize.

    Authors: We agree that the original submission did not include internal validation, leaving the performance estimates susceptible to optimism. In the revised manuscript we have added a split-sample validation (random 70/30 training/test partition). Biomarker selection and model fitting were performed exclusively in the training set; discrimination and cumulative incidence were then evaluated in the held-out test set. The improvement in C-index and the separation in the high-risk decile are preserved, although modestly attenuated, in the independent test sample. These validation results are now reported in the main text and a new supplementary table. revision: yes

  2. Referee: [Methods, statistical modeling] Methods, statistical modeling subsection: the construction of the base clinical model and the biomarker-augmented model does not specify whether variable selection used penalized regression, pre-specified covariates, or any internal validation procedure before computing the C-index and cumulative incidence estimates.

    Authors: We thank the referee for highlighting this lack of detail. The base clinical model used a pre-specified set of covariates (age, sex, BMI, smoking status, and family history of hematological malignancy). The biomarker-augmented model added the top protein-dysregulation markers identified by univariate Cox regression (p < 0.01) without penalization or regularization. No internal validation was performed in the original analysis. The revised Methods section now explicitly describes this procedure. In addition, the split-sample validation described in response to the first comment has been incorporated to provide validated performance estimates. revision: yes
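The split-sample procedure described in the simulated responses can be sketched as below. The point of the sketch is the discipline it enforces: marker selection and model fitting see only the training partition, and performance is scored only on the held-out partition. The function names and the callback interface are assumptions made for illustration; the rebuttal gives only the 70/30 random split and says nothing about the implementation.

```python
import random


def split_sample(ids, train_frac=0.7, seed=0):
    """Randomly partition participant ids into training and test lists."""
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def held_out_performance(ids, select_and_fit, score, train_frac=0.7, seed=0):
    """Select markers and fit on training ids only, then score on test ids.

    select_and_fit : callable(train_ids) -> fitted model (hypothetical interface)
    score          : callable(model, test_ids) -> performance metric
    """
    train_ids, test_ids = split_sample(ids, train_frac, seed)
    model = select_and_fit(train_ids)  # must never be shown test_ids
    return score(model, test_ids)
```

With only 768 cases among 299,035 participants, a split stratified by case status may be preferable to a purely random one so that both partitions keep a similar event rate; that refinement is a suggestion here, not something the responses describe.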

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper conducts a biomarker-wide scan across 61 analytes in the UK Biobank cohort, identifies associations via hazard ratios from Cox models, and reports an in-cohort C-index improvement after incorporating selected markers. These steps are direct empirical outputs from standard regression and concordance calculations on the observed data rather than reductions by construction to the inputs. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described methods. The central claims remain independent statistical associations and model fits without tautological equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The analysis rests on standard statistical modeling assumptions and data quality in a large biobank cohort without introducing new free parameters or invented entities beyond conventional Cox models and biomarker selection.

axioms (2)
  • [standard math] Cox proportional hazards assumptions hold for time-to-event biomarker associations.
    Hazard ratios are reported for incident MM.
  • [domain assumption] Routine blood analytes accurately reflect underlying physiological states without substantial measurement error.
    Used for all 61 analytes in the scan.
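
For reference, the proportional hazards assumption listed above and the per-SD hazard ratios quoted in the abstract can be written in the usual notation. The symbols are the conventional ones and are not taken from the paper; adjustment covariates are omitted for brevity.

```latex
h(t \mid z) = h_0(t)\,\exp(\beta z),
\qquad z = \frac{x - \mu}{\sigma},
\qquad \mathrm{HR}_{\text{per SD}} = \frac{h(t \mid z + 1)}{h(t \mid z)} = e^{\beta}
```

Here x is the raw analyte value, μ and σ are its cohort mean and standard deviation, and h_0(t) is the baseline hazard. The ratio does not depend on t, which is exactly the assumption being flagged. On this scale an HR of 1.54 corresponds to a 54% higher hazard per one-SD increase in the analyte, and an HR of 0.61 to a 39% lower hazard.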



