Modeling and forecasting subnational age distribution of death counts
Pith reviewed 2026-05-22 22:40 UTC · model grok-4.3
The pith
A cumulative distribution function transformation improves forecasts of subnational age distributions of death counts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The age distribution of death counts resembles a probability density function and therefore occupies a constrained nonlinear space. Applying a cumulative distribution function transformation, which is scale-free and preserves monotonicity, allows standard forecasting methods to generate more accurate forecasts of subnational death distributions than methods that treat the data as unconstrained.
What carries the argument
The cumulative distribution function transformation applied to age distributions of death counts, which converts them into a form amenable to forecasting while remaining scale-free and monotonicity-preserving.
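As a rough illustration of the two properties named here, the sketch below maps a vector of death counts to a cumulative distribution and checks that the result is scale-free and monotone. The function name `to_cdf` and the five-age-group counts are hypothetical, chosen only for illustration; the paper's exact transformation may include further steps not described in this summary.

```python
import numpy as np

def to_cdf(deaths):
    """Map non-negative death counts by age to a cumulative distribution."""
    deaths = np.asarray(deaths, dtype=float)
    total = deaths.sum()
    if total <= 0 or np.any(deaths < 0):
        raise ValueError("death counts must be non-negative with a positive total")
    return np.cumsum(deaths) / total

# Hypothetical life-table death counts for five age groups (they sum to 100,000).
deaths = np.array([500.0, 1500.0, 8000.0, 40000.0, 50000.0])
cdf = to_cdf(deaths)

# Scale-free: multiplying every count by a constant leaves the CDF unchanged.
same_shape = bool(np.allclose(cdf, to_cdf(3.0 * deaths)))

# Monotone: the CDF never decreases and ends at 1.
is_monotone = bool(np.all(np.diff(cdf) >= 0))
```

Because the total is divided out, two regions with very different population sizes but the same age pattern of deaths map to the same curve.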
If this is right
- More accurate forecasts of life-table death counts at subnational scales
- Improved estimation of regional age-specific survival probabilities
- Better subnational life expectancy calculations
- A practical basis for actuaries to explore annuity pricing across ages and maturities
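To show how forecast death counts feed the quantities listed above, here is a minimal sketch that derives survivors, one-year survival probabilities, and life expectancy at birth from life-table death counts. It assumes single-year age groups, a closed final age group, and deaths falling on average at mid-interval; the function name `life_table_summary` and the toy counts are hypothetical, and real life tables treat the open-ended last age group differently.

```python
import numpy as np

def life_table_summary(dx, radix=None):
    """Return survivors l_x, survival probabilities p_x, and life expectancy e_0."""
    dx = np.asarray(dx, dtype=float)
    radix = dx.sum() if radix is None else radix
    # Survivors at the start of each age, plus the end of the last age.
    lx = radix - np.concatenate(([0.0], np.cumsum(dx)))
    # One-year survival probabilities p_x = l_{x+1} / l_x.
    px = lx[1:] / lx[:-1]
    # Person-years lived, assuming deaths occur mid-interval on average.
    Lx = 0.5 * (lx[:-1] + lx[1:])
    e0 = Lx.sum() / lx[0]
    return lx, px, e0

# Toy example: four one-year age groups and a radix of 100.
lx, px, e0 = life_table_summary([10.0, 20.0, 30.0, 40.0])
```

In this toy case the survivors are 100, 90, 70, 40, 0 and the life expectancy at birth is 2.5 years.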
Where Pith is reading between the lines
- The transformation approach could be tested on subnational data from countries with varying data quality to check robustness
- It might be combined with joint models that forecast both population size and death distributions simultaneously
- The method could help isolate the effect of data noise in lower-quality subnational regions
Load-bearing premise
The age distribution of death counts can be treated as living in a constrained nonlinear space, and the CDF transformation remains scale-free and monotonicity-preserving through forecasting and inversion, so that back-transformed forecasts are still valid death distributions.
What would settle it
A direct accuracy comparison on the Japanese subnational life-table data in which forecasts that skip the CDF transformation match or exceed the accuracy of those that use it.
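A comparison of this kind could be organised along the following lines. This is only a sketch on synthetic data, not the paper's protocol: the random-walk-with-drift forecaster, the logit step applied to the CDF, the holdout length, and every function name are assumptions made for illustration, and the Japanese subnational data are not used here.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error over all years and ages."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

def drift_forecast(history, horizon):
    """Random walk with drift, applied column by column (rows are years)."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    steps = np.arange(1, horizon + 1)[:, None]
    return history[-1] + steps * slope

def forecast_direct(density, horizon):
    """Baseline: forecast the proportions as if they were unconstrained."""
    return drift_forecast(density, horizon)

def forecast_via_cdf(density, horizon, eps=1e-9):
    """Forecast the logit of the CDF, then difference back to proportions."""
    # Drop the final column, which always equals 1 and carries no information.
    cdf = np.clip(np.cumsum(density, axis=1)[:, :-1], eps, 1 - eps)
    logit = np.log(cdf / (1 - cdf))
    cdf_fc = 1 / (1 + np.exp(-drift_forecast(logit, horizon)))
    cdf_fc = np.maximum.accumulate(cdf_fc, axis=1)  # repair monotonicity violations
    full = np.hstack([np.zeros((horizon, 1)), cdf_fc, np.ones((horizon, 1))])
    return np.diff(full, axis=1)

# Synthetic proportions whose mode drifts toward older ages over time.
rng = np.random.default_rng(0)
years, ages, horizon = 30, 10, 5
grid = np.arange(ages)
mode = 4 + 0.1 * np.arange(years)
raw = np.exp(-0.5 * ((grid[None, :] - mode[:, None]) / 1.5) ** 2)
raw *= rng.uniform(0.95, 1.05, size=raw.shape)
density = raw / raw.sum(axis=1, keepdims=True)

# Hold out the last `horizon` years and score both approaches on them.
train, test = density[:-horizon], density[-horizon:]
cdf_based = forecast_via_cdf(train, horizon)
mae_direct = mae(test, forecast_direct(train, horizon))
mae_cdf = mae(test, cdf_based)
```

The claim would be undermined if, on the real data and with the paper's own forecasting methods, `mae_direct` were consistently no larger than `mae_cdf`. Note that the CDF route returns non-negative proportions summing to one by construction, whereas the direct route can produce negative values.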
Original abstract
Existing mortality forecasting methods focus on age-specific mortality rates, which lie in an unconstrained space and overlook the distributional nature of life-table death counts. Few studies have developed and compared forecasting methods that model the shape and dynamics of the age distribution of deaths, especially at the subnational level, where data quality varies greatly. This paper presents several forecasting methods to model and forecast the subnational age distribution of death counts. The age distribution of death counts has many similarities to probability density functions, which are non-negative and have a constrained integral, and thus live in a constrained nonlinear space. To address the nonlinear nature of objects, we implement a cumulative distribution function transformation that is scale-free and has additional monotonicity. Using subnational Japanese life-table death counts from the Japanese Mortality Database (2025), we evaluate the forecast accuracy of the transformation and forecasting methods. The improved forecast accuracy of life-table death counts implemented here will be of great interest to demographers in estimating regional age-specific survival probabilities and life expectancy, and to actuaries as a foundation for exploring potential applications in determining annuity prices for various ages and maturities.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops and compares forecasting methods for the subnational age distribution of death counts by applying a cumulative distribution function (CDF) transformation to handle the constrained nonlinear space analogous to probability densities. Using subnational Japanese life-table death counts from the Japanese Mortality Database, it evaluates forecast accuracy and claims improvements over methods that ignore the distributional nature, with applications to regional survival probabilities, life expectancy, and actuarial pricing.
Significance. If the central claim holds after addressing the transformation details, the work would advance compositional forecasting in demography by explicitly modeling the shape of death distributions rather than unconstrained rates. The subnational focus with variable data quality and use of an external database are strengths that could support reproducible applications in regional mortality analysis.
major comments (2)
- [Abstract] Abstract: the central claim that the CDF transformation is scale-free and monotonicity-preserving in a way that preserves forecast relevance upon inversion is load-bearing, yet the abstract (and by extension the methods) provides no explicit description of how totals are handled or the exact inversion procedure; this leaves open whether back-transformed forecasts remain superior when subnational totals vary and data quality is heterogeneous.
- [Evaluation] Evaluation section: the reported accuracy gains must be accompanied by a full accounting of all candidate methods and pre-specified selection criteria; without this, the comparison risks post-hoc selection that could inflate apparent improvements over direct age-specific approaches.
minor comments (2)
- [Abstract] Abstract: the citation 'Japanese Mortality Database (2025)' appears to reference a future or misdated source; clarify the exact data vintage and access details.
- Notation: ensure consistent use of symbols for the transformed CDF and its inverse across equations and figures to avoid ambiguity in the back-transformation step.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment below and have revised the manuscript to improve clarity and transparency where needed.
- Referee: [Abstract] Abstract: the central claim that the CDF transformation is scale-free and monotonicity-preserving in a way that preserves forecast relevance upon inversion is load-bearing, yet the abstract (and by extension the methods) provides no explicit description of how totals are handled or the exact inversion procedure; this leaves open whether back-transformed forecasts remain superior when subnational totals vary and data quality is heterogeneous.
  Authors: We agree that the abstract would benefit from additional detail on these points to strengthen the central claim. The CDF transformation, as implemented in Section 3, normalizes the cumulative death counts by the region-year total, rendering it scale-free by construction. Inversion proceeds by forecasting the CDF, recovering proportions via first differences, and rescaling by an independently forecasted total (obtained via univariate time-series modeling of the raw totals). We have revised the abstract to briefly describe the inversion and handling of totals, and we have added a clarifying paragraph in the methods section with a worked numerical example. This revision also notes applicability under heterogeneous data quality, as the normalization is performed separately per subnational unit.
  Revision: yes
- Referee: [Evaluation] Evaluation section: the reported accuracy gains must be accompanied by a full accounting of all candidate methods and pre-specified selection criteria; without this, the comparison risks post-hoc selection that could inflate apparent improvements over direct age-specific approaches.
  Authors: We acknowledge the importance of full transparency to avoid any perception of post-hoc selection. The evaluation in Section 4 considered a pre-specified suite of methods drawn from the compositional data and mortality forecasting literature (including direct age-specific ARIMA/ETS models, log-ratio transformations, and functional approaches), with model selection and accuracy assessment based on fixed out-of-sample MAE criteria established prior to analysis. To address the referee's concern directly, we have added an explicit subsection and supplementary table in the revised manuscript that enumerates every candidate method considered, the a priori inclusion criteria, and the exact selection protocol used.
  Revision: yes
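The inversion described in the first response (forecast the CDF, take first differences, rescale by a separately forecast total) might look roughly like this. The function name `cdf_to_counts`, the monotonicity repair, and the numbers are assumptions of this sketch; in particular, the total of 20,000 simply stands in for the output of whatever univariate time-series model is fitted to the raw totals.

```python
import numpy as np

def cdf_to_counts(cdf_forecast, total_forecast):
    """Back-transform a forecast CDF into age-specific death counts."""
    cdf = np.clip(np.asarray(cdf_forecast, dtype=float), 0.0, 1.0)
    cdf = np.maximum.accumulate(cdf)  # guard against small monotonicity violations
    cdf[-1] = 1.0                     # the last age group closes the distribution
    proportions = np.diff(cdf, prepend=0.0)  # first differences recover proportions
    return proportions * total_forecast

# Hypothetical forecast CDF with one small violation (0.04 after 0.05).
cdf_forecast = np.array([0.01, 0.05, 0.04, 0.60, 0.98])
counts = cdf_to_counts(cdf_forecast, total_forecast=20000.0)
```

Here the repaired CDF is 0.01, 0.05, 0.05, 0.60, 1.00, so the recovered counts are 200, 800, 0, 11,000 and 8,000, which sum to the forecast total. Any error in the forecast total scales every age group equally, which is why the handling of totals matters for the accuracy comparison.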
Circularity Check
No significant circularity detected
The paper applies a standard CDF transformation to age distributions of death counts (treated as analogous to densities) and evaluates multiple forecasting methods on external data from the Japanese Mortality Database. No load-bearing steps reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains. The central claim rests on empirical forecast accuracy comparisons rather than any derivation that is equivalent to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The age distribution of death counts lives in a constrained nonlinear space similar to that of probability density functions (non-negative, fixed integral).