pith. sign in

arxiv: 2604.07974 · v1 · submitted 2026-04-09 · 📊 stat.AP

Socio-demographic inequalities in the maximum human lifespan

Pith reviewed 2026-05-10 18:11 UTC · model grok-4.3

classification 📊 stat.AP
keywords maximum lifespanextreme value theorysocio-demographic inequalitiesage at deathupper limitBelgiumNetherlandslongevity
0
0 comments X

The pith

Microdata from Belgium and the Netherlands show that the human lifespan has a finite upper limit that differs by sex, origin, education, and living arrangements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests for the existence of a maximum human lifespan and whether that maximum varies across socio-demographic groups using complete individual-level death and population records for everyone aged 90 and older in Belgium and the Netherlands from 1995 to 2022. It models the extreme upper tail of the age-at-death distributions with tools from extreme value theory and compares the resulting endpoints across categories of sex, origin, civil status, household type, and education level. A sympathetic reader would care because the results address the long-running debate over whether lifespan is biologically capped and reveal persistent inequalities that reach the oldest ages. The analysis concludes that men reach lower maxima than women, that widowed and institutionalized individuals have shorter maxima, and that non-Western European origin and higher education are linked to longer maxima.

Core claim

Using extreme value theory on microdata covering all deaths and residents aged 90+ in Belgium and the Netherlands between 1995 and 2022, the study finds statistical evidence for a finite upper limit to lifespan and shows that this limit is lower for men than for women, lower for widowed persons and those living in institutions, and higher for individuals of non-Western European origin and those with higher educational attainment.

What carries the argument

Extreme value theory models fitted to the upper tail of age-at-death distributions to estimate and test differences in finite upper endpoints across socio-demographic subgroups.

If this is right

  • The upper limit to lifespan is not uniform but depends on sex, origin, education, and household situation.
  • Socio-demographic inequalities in survival persist into the extreme tail of the age distribution.
  • Projections of the number of future centenarians and supercentenarians should incorporate group-specific maxima.
  • The mechanisms producing these tail inequalities could be investigated through targeted studies of health, behavior, or selection effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the group differences arise from modifiable social factors, then interventions improving education or support for widowed persons might shift the observed maxima upward.
  • Repeating the analysis on sub-periods of the data could test whether the estimated limits are stable or changing over time.
  • Comparable microdata from other countries would show whether the same pattern of inequalities holds outside Belgium and the Netherlands.

Load-bearing premise

The observed ages at death for the oldest individuals accurately reflect the true upper limit without substantial distortion from incomplete records, migration, or changes in data practices over the study period.

What would settle it

A verified new record age at death well above the model's estimated upper limits for any group, or new data showing the tail of the distribution becoming heavier rather than bounded.

Figures

Figures reproduced from arXiv: 2604.07974 by Jens Robben, Torsten Kleinow.

Figure 1
Figure 1. Figure 1: Barplots of the socio-demographic covariates at different threshold ages in Belgium. Source: Statbel and authors’ calculations. In contrast to Belgium, the proportion of right-censored observations is slightly lower, and almost all censored observations are in the final observation year (99.7%), which again explains the large number of observations in the final calendar period. Similar to Belgium, the tabl… view at source ↗
Figure 2
Figure 2. Figure 2: Barplots of the socio-demographic covariates at different threshold ages in the Nether￾lands. Source: CBS and authors’ calculations. 3 Methods 3.1 Extreme value theory framework To study the upper tail of the lifespan distribution, we rely on extreme value theory (EVT). Let X be a random variable that denotes the age at death of a particular individual. Under some mild regularity conditions as stated in th… view at source ↗
Figure 3
Figure 3. Figure 3: Generalized Pareto Q–Q plots for Belgium (left panel) and the Netherlands (right panel) at a threshold age of 100 years. Source: Statbel, CBS, and authors’ calculations. To assess robustness with respect to the threshold age u, we repeat the analysis using alternative threshold ages and report the results in the Suppl. Mat. A. 4.2 Estimated scale and shape parameters We estimate the shape parameter, or ext… view at source ↗
Figure 4
Figure 4. Figure 4: Left panels: Estimated parameters of the socio-demographic covariates in the scale pa￾rameter of the generalized Pareto distribution for Belgium (top) and the Netherlands (bottom), with 95% confidence intervals. Right panels: boxplot of the estimated scale parameter values for all individuals in our data. Source: Statbel, CBS, and authors’ calculations. civ: Considering civil status, individuals who are un… view at source ↗
Figure 5
Figure 5. Figure 5: Density of the estimated maximum lifespans across socio-demographic profiles with a frequency greater than 10 in the data set for Belgium (left) and the Netherlands (right). The density is weighted by the observed frequency of each profile in the respective dataset. Dashed vertical lines indicate the minimum and maximum estimated lifespans, while the solid vertical line denotes the median. Source: Statbel,… view at source ↗
Figure 6
Figure 6. Figure 6: shows the estimated shape parameters of the generalized Pareto distribution for threshold ages 98–102 in Belgium (left panel) and the Netherlands (right panel). Although there appears to be a mild increasing trend in the point estimates of the shape parameter as the threshold age increases, the width of the confidence intervals increases substantially for higher thresholds, reflecting the smaller number of… view at source ↗
Figure 7
Figure 7. Figure 7: Estimated regression coefficients βσ in the log-scale parameter logσi = βTσ zi of the Generalized Pareto distribution for different threshold ages (i.e., ages 98-102) in Belgium. Point estimates are shown together with 95% confidence intervals. Source: Statbel and authors’ calculations. 18 [PITH_FULL_IMAGE:figures/full_fig_p018_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Estimated regression coefficients βσ in the log-scale parameter logσi = βTσ zi of the Generalized Pareto distribution for different threshold ages (i.e., ages 98-102) in the Netherlands. Point estimates are shown together with 95% confidence intervals. Source: CBS and authors’ calculations. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_8.png] view at source ↗
read the original abstract

The existence of an upper limit to the human lifespan has been widely debated, with studies offering both supporting and opposing evidence. Using unique individual-level death and population records for individuals aged 90 and older in Belgium and the Netherlands between 1995 and 2022, we provide statistical evidence supporting the existence of an upper limit. A related yet unexplored question is whether this life span limit differs across socio-demographic groups. Our microdata include information on the sex, origin, civil status, type of household, and education level of each individual. Using tools from extreme value theory, we quantify and compare the upper tail of human lifespan distributions across these socio-demographic characteristics. We find that men have a statistically lower maximum lifespan than women and that individuals who are widowed or live in institutional households have a clearly lower maximum lifespan. Finally, individuals of non-Western European origin and those with higher educational attainment exhibit longer maximum lifespans.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper uses individual-level Belgian and Dutch administrative microdata on deaths and population for ages 90+ (1995–2022) together with extreme-value theory to argue that the human lifespan distribution has a finite upper endpoint and that this endpoint differs systematically by sex, origin, civil status, household type, and education. It reports that men, the widowed, and those in institutional households exhibit shorter maxima while non-Western European origin and higher education are associated with longer maxima.

Significance. If the tail estimates are robust to data-quality issues, the work supplies one of the largest microdata-based EVT applications to the lifespan-limit debate and supplies the first systematic socio-demographic comparison of upper endpoints. The availability of linked individual records rather than aggregated life tables is a clear methodological advantage.

major comments (3)
  1. [Data section] Data section: the manuscript provides no description of age-validation protocols, completeness checks for deaths at 105+, or handling of possible net migration or registration changes across the 1995–2022 window. Because the central claim of a finite upper limit rests on the observed right tail accurately reflecting the conditional distribution, the absence of these checks is load-bearing.
  2. [Methods section] Methods section: threshold selection, model choice (e.g., GEV vs. GPD), and goodness-of-fit diagnostics for the EVT fits are not reported. Without these, it is impossible to assess whether the reported endpoint estimates and group differences are sensitive to post-hoc modeling decisions.
  3. [Results section] Results section: no period-specific sensitivity analyses or comparison with external sources (e.g., census life tables or HMD data) are shown. Such checks are required to rule out spurious finite-endpoint signals induced by evolving data-collection practices.
minor comments (2)
  1. [Abstract] The abstract and introduction should explicitly state the sample sizes for each socio-demographic stratum to allow readers to judge precision of the tail estimates.
  2. [Methods section] Notation for the endpoint parameter and shape parameter should be introduced once and used consistently; current usage mixes “maximum lifespan” and EVT terminology without a clear mapping.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their insightful comments on our manuscript. We address each of the major comments below and have made revisions to the manuscript to incorporate the suggested improvements.

read point-by-point responses
  1. Referee: [Data section] Data section: the manuscript provides no description of age-validation protocols, completeness checks for deaths at 105+, or handling of possible net migration or registration changes across the 1995–2022 window. Because the central claim of a finite upper limit rests on the observed right tail accurately reflecting the conditional distribution, the absence of these checks is load-bearing.

    Authors: We acknowledge the importance of documenting data quality for the extreme tail. In the revised manuscript, we have added a new paragraph in the Data section detailing the age validation procedures used by the Belgian and Dutch statistical agencies, which involve linkage to birth records and are considered highly reliable for ages 90+. We also report the number of deaths observed at ages 105+ and above, and discuss that registration changes during the period were minimal for this age group, with no evidence of systematic under- or over-reporting. Net migration for individuals over 90 is very low in these countries, as confirmed by the data providers. revision: yes

  2. Referee: [Methods section] Methods section: threshold selection, model choice (e.g., GEV vs. GPD), and goodness-of-fit diagnostics for the EVT fits are not reported. Without these, it is impossible to assess whether the reported endpoint estimates and group differences are sensitive to post-hoc modeling decisions.

    Authors: We agree that these methodological details are essential for reproducibility and robustness assessment. The revised manuscript now includes a dedicated subsection on EVT implementation: we specify the use of the Generalized Pareto Distribution (GPD) for modeling exceedances, with thresholds selected based on mean excess plots and stability of parameter estimates across a range of thresholds (90th to 95th percentiles). Goodness-of-fit is assessed via QQ-plots, PP-plots, and the Anderson-Darling test, with results reported in the appendix. Sensitivity analyses to threshold choice are also presented, showing that the endpoint estimates and group differences remain consistent. revision: yes

  3. Referee: [Results section] Results section: no period-specific sensitivity analyses or comparison with external sources (e.g., census life tables or HMD data) are shown. Such checks are required to rule out spurious finite-endpoint signals induced by evolving data-collection practices.

    Authors: We have added these checks to the revised manuscript. Period-specific analyses are now included by estimating the model separately for 1995-2008 and 2009-2022, confirming that the finite upper limits and socio-demographic inequalities are robust across sub-periods. Additionally, we compare our aggregate endpoint estimates with those from the Human Mortality Database (HMD) for Belgium and the Netherlands over the same period, finding close agreement in the overall maximum lifespan (around 115 years), which supports that our results are not driven by data artifacts. While HMD does not provide socio-demographic breakdowns, this aggregate validation is reassuring. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical EVT estimates from external microdata

full rationale

The paper applies standard extreme-value theory methods directly to individual-level Belgian and Dutch administrative death records (1995-2022) to estimate finite upper limits and socio-demographic differences. These are data-driven inferences with no self-definitional steps, no fitted parameters re-labeled as predictions, and no load-bearing self-citations that reduce the central claims to their own inputs. The derivation remains self-contained against the observed tail data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The analysis rests on standard extreme value theory assumptions for modeling maxima of age-at-death distributions and on the completeness of the administrative death and population registers for ages 90+.

axioms (1)
  • domain assumption The upper tail of the age-at-death distribution belongs to the domain of attraction of one of the extreme value distributions.
    Invoked by the choice of extreme value theory tools for estimating the upper endpoint.

pith-pipeline@v0.9.0 · 5449 in / 1347 out tokens · 47912 ms · 2026-05-10T18:11:48.651032+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

  1. [1]

    Aarssen, K., & De Haan, L. (1994). On the maximal life span of humans.Mathematical Population Studies,4(4), 259–281. doi: 10.1080/08898489409525379 Balkema, A. A., & De Haan, L. (1974). Residual life time at great age.The Annals of probability, 792–804. doi: 10.1214/aop/1176996548 Bongaarts, J. (2006). How long will we live?Population and development revi...

  2. [2]

    −δ i + 1 ξ +δ i ξyie−βT σ zi 1 +ξy ie−βT σ zi − ξaie−βT σ zi 1 +ξa ie−βT σ zi # zi,(B.2) and the derivative with respect to the extreme value indexξequals: ∂ℓ(θ) ∂ξ = X i:yi>0

    Hence, the substantive conclusions regarding socio-demographic effects remain unchanged. Only for the covariateorgin Belgium does there appear to be a more pronounced upward trend in the point estimates. This behavior is driven by the limited number of individuals with a non-Western European origin in the sample and by the fact that several of the most ex...