
arxiv: 2505.19034 · v2 · submitted 2025-05-25 · nlin.AO · nlin.CD · physics.data-an · physics.geo-ph

The influence of data gaps and outliers on resilience indicators

Pith reviewed 2026-05-19 13:51 UTC · model grok-4.3

keywords: resilience indicators, critical slowing down, data gaps, outliers, temporal autocorrelation, variance, time series analysis, Earth system resilience

The pith

The agreement between variance- and autocorrelation-based resilience indicators is driven by the time series' initial data point.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines how missing values and outliers affect indicators that track resilience through variance and autocorrelation in time series data. It reveals that the two indicators agree largely because of the starting value in the record rather than deeper system properties. Missing values reduce how consistently the indicators align, and outliers systematically push autocorrelation measures to suggest higher resilience than may be present. These effects matter for efforts to detect critical slowing down before abrupt shifts in Earth systems such as climate or ecosystems. The work supplies a basis for better data preprocessing and for judging how trustworthy the indicators are when applied to imperfect real-world records.

Core claim

The paper establishes a rigorous mathematical analysis showing that the statistical dependency between variance-based and autocorrelation-based resilience indicators is fundamentally driven by the time series' initial data point. Using both synthetic and empirical data, it demonstrates that missing values substantially weaken indicator agreement while outliers introduce systematic biases that lead to overestimation of resilience based on temporal autocorrelation.

What carries the argument

The statistical dependency between variance- and autocorrelation-based resilience indicators, shown to be driven by the initial data point of the time series.
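
As a rough illustration of the two indicators discussed here, the following minimal sketch computes the variance and the lag-1 autocorrelation of synthetic AR(1) series. It is not the paper's code: the function names, the parameter values (`phi`, `sigma`, the seed), and the choice to start each series at zero are all assumptions made for this example.

```python
import random
import statistics


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of floats."""
    mean = statistics.fmean(x)
    dev = [v - mean for v in x]
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    den = sum(d * d for d in dev)
    return num / den


def simulate_ar1(n, phi, sigma=1.0, seed=0):
    """Simulate x[t+1] = phi * x[t] + sigma * eps[t] with Gaussian eps."""
    rng = random.Random(seed)
    x = [0.0]  # arbitrary initial value (an assumption of this sketch)
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x


def indicators(x):
    """Return (variance, lag-1 autocorrelation) of one series or window."""
    return statistics.pvariance(x), lag1_autocorrelation(x)


# A phi closer to 1 mimics slower recovery, i.e. lower resilience.
var_stable, ac1_stable = indicators(simulate_ar1(20000, phi=0.5))
var_weak, ac1_weak = indicators(simulate_ar1(20000, phi=0.9))
print(f"phi=0.5: variance={var_stable:.2f}, AC1={ac1_stable:.2f}")
print(f"phi=0.9: variance={var_weak:.2f}, AC1={ac1_weak:.2f}")
```

On long, clean, stationary series like these, both indicators rise together as `phi` approaches 1. The sketch does not reproduce the paper's finite-sample analysis of how the initial data point enters their dependency.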

If this is right

  • Preprocessing must address missing values to preserve agreement between the indicators.
  • Outliers produce a consistent upward bias in resilience estimates drawn from autocorrelation.
  • Accuracy checks for resilience inferences from real data must incorporate effects of gaps and outliers.
  • The results supply a foundation for improved preprocessing across fields that apply these indicators to observational records.
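
The outlier and gap effects listed above can be illustrated with a small experiment. This is a hedged sketch, not the paper's procedure: the contamination rates, the spike magnitude, the use of `None` for missing values, and the pairwise-complete estimator `ac1_pairwise` are all assumptions chosen for this example.

```python
import random


def ac1_pairwise(x):
    """Lag-1 autocorrelation from adjacent pairs where both values exist.

    Missing values are represented as None (a convention of this sketch).
    Returns the estimate and the number of usable adjacent pairs.
    """
    observed = [v for v in x if v is not None]
    mean = sum(observed) / len(observed)
    pairs = [(a - mean, b - mean)
             for a, b in zip(x[:-1], x[1:])
             if a is not None and b is not None]
    num = sum(a * b for a, b in pairs) / len(pairs)
    den = sum((v - mean) ** 2 for v in observed) / len(observed)
    return num / den, len(pairs)


rng = random.Random(42)
phi, n = 0.8, 5000

# Clean AR(1) series with unit-variance Gaussian noise.
clean = [0.0]
for _ in range(n - 1):
    clean.append(phi * clean[-1] + rng.gauss(0.0, 1.0))

# Contaminate 2% of the points with isolated spikes of magnitude 10.
spiked = list(clean)
for i in rng.sample(range(n), k=n // 50):
    spiked[i] += rng.choice([-10.0, 10.0])

# Delete 30% of the points at random to create gaps.
gappy = list(clean)
for i in rng.sample(range(n), k=3 * n // 10):
    gappy[i] = None

ac1_clean, pairs_clean = ac1_pairwise(clean)
ac1_spiked, _ = ac1_pairwise(spiked)
ac1_gappy, pairs_gappy = ac1_pairwise(gappy)
print(f"clean:  AC1={ac1_clean:.2f} from {pairs_clean} pairs")
print(f"spiked: AC1={ac1_spiked:.2f}")
print(f"gappy:  AC1={ac1_gappy:.2f} from {pairs_gappy} pairs")
```

Isolated spikes add variance that is uncorrelated from one step to the next, so the estimated lag-1 autocorrelation drops, which would be read as faster recovery and hence higher resilience. The gap case only shows how many adjacent pairs remain usable. It makes no claim about indicator agreement, which the paper analyzes separately.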

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Applying the same checks to other early-warning methods that rely on similar statistics could expose comparable sensitivities to data quality.
  • Testing the initial-point dependence on longer or multivariate records might clarify how the effect scales with series length.
  • These biases could be mitigated by explicit correction terms derived from the initial value, offering a direct way to adjust existing indicator calculations.

Load-bearing premise

The analysis assumes the underlying time series are generated from processes whose statistical properties such as stationarity and noise structure are known.

What would settle it

A collection of time series in which indicator agreement shows no dependence on the initial data point, or in which missing values do not reduce agreement, would challenge the central claims.

Original abstract

The resilience, or stability, of major Earth system components is increasingly threatened by anthropogenic pressures, demanding reliable early warning signals for abrupt and irreversible regime shifts. Widely used data-driven resilience indicators based on variance and autocorrelation detect 'critical slowing down', a signature of decreasing stability. However, the interpretation of these indicators is hampered by poorly understood interdependencies and their susceptibility to common data issues such as missing values and outliers. Here, we establish a rigorous mathematical analysis of the statistical dependency between variance- and autocorrelation-based resilience indicators, revealing that their agreement is fundamentally driven by the time series' initial data point. Using synthetic and empirical data, we demonstrate that missing values substantially weaken indicator agreement, while outliers introduce systematic biases that lead to overestimation of resilience based on temporal autocorrelation. Our results provide a necessary and rigorous foundation for preprocessing strategies and accuracy assessments across the growing number of disciplines that use real-world data to infer changes in system resilience.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript derives algebraically that agreement between variance- and autocorrelation-based resilience indicators is fundamentally driven by the time series' initial data point. Using synthetic series from stationary processes and empirical records from public datasets, it shows that missing values weaken indicator agreement, while outliers systematically bias autocorrelation-based estimates toward overestimating resilience. The results are offered as a foundation for preprocessing strategies and accuracy assessments when applying critical slowing down indicators to imperfect real-world data.

Significance. If the algebraic dependency is general and the quantified effects of gaps and outliers prove robust, the work supplies a needed rigorous basis for interpreting interdependencies among widely used resilience indicators. The clean algebraic demonstration of the initial-point dependency and the reliance on standard public datasets are explicit strengths. The findings would improve reliability of early-warning applications in Earth-system science and related fields that routinely encounter data gaps and outliers.

major comments (2)
  1. [§4.1] (Synthetic Data Experiments): The reported magnitudes of weakened agreement due to missing values and of the outlier-induced bias in autocorrelation are quantified exclusively on AR(1)-type processes with fixed stationarity and noise parameters. The true data-generating process may contain unmodeled trends, long-memory correlations, or state-dependent noise, in which case the direction and size of these effects could change, weakening support for the advocated preprocessing strategies.
  2. [§5] (Empirical Illustrations): The interpretation of the empirical records assumes the same statistical properties used to generate the synthetic data. Without explicit sensitivity checks or alternative process classes, the claim that the observed weakening and overestimation generalize remains conditional on those assumptions.
minor comments (2)
  1. Abstract: the phrase 'rigorous mathematical analysis' would be strengthened by a one-sentence reference to the specific initial-point dependency that is derived.
  2. [Figures 2-3] Captions: variability across realizations, or explicit error bars on the reported bias magnitudes, should be shown so that readers can judge robustness.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and constructive suggestions. We address the major comments point by point below, providing clarifications on the generality of our algebraic results and indicating revisions to enhance robustness.

Point-by-point responses
  1. Referee: [§4.1] (Synthetic Data Experiments): The reported magnitudes of weakened agreement due to missing values and of the outlier-induced bias in autocorrelation are quantified exclusively on AR(1)-type processes with fixed stationarity and noise parameters. The true data-generating process may contain unmodeled trends, long-memory correlations, or state-dependent noise, in which case the direction and size of these effects could change, weakening support for the advocated preprocessing strategies.

    Authors: The algebraic derivation establishing that indicator agreement is driven by the initial data point is independent of the underlying data-generating process and applies to any time series. Our synthetic experiments employ AR(1) processes because they represent the standard linear approximation for critical slowing down near tipping points, which is the regime where resilience indicators are most relevant. We agree that exploring other processes could provide additional insight into the magnitude of effects. In the revised manuscript, we will add a sensitivity analysis using an ARMA(1,1) process and a brief discussion of how trends might interact with the observed biases. revision: partial

  2. Referee: [§5] (Empirical Illustrations): The interpretation of the empirical records assumes the same statistical properties used to generate the synthetic data. Without explicit sensitivity checks or alternative process classes, the claim that the observed weakening and overestimation generalize remains conditional on those assumptions.

    Authors: The empirical section uses real-world datasets from public repositories that are representative of applications in Earth system science. The weakening of agreement due to gaps and the upward bias from outliers are demonstrated directly on these records, consistent with the general algebraic framework. To strengthen the generalization, we will incorporate sensitivity checks by repeating the analysis on detrended versions and on subsets of the data to assess robustness to potential non-stationarities. revision: yes
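
For readers unfamiliar with why AR(1) processes serve as the reference model in the first response, the standard textbook relations can be written out as follows. This is general background and not the paper's derivation; the symbols \varphi (autoregressive coefficient), \sigma (noise amplitude), and \varepsilon_t (unit-variance white noise) are notation assumed for this note.

```latex
x_{t+1} = \varphi\, x_t + \sigma\, \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, 1), \quad |\varphi| < 1

\operatorname{Var}(x_t) = \frac{\sigma^2}{1 - \varphi^2},
\qquad \rho(1) = \varphi
```

As \varphi approaches 1 (slower recovery from perturbations), both the stationary variance and the lag-1 autocorrelation \rho(1) increase, which is why both are used as signatures of critical slowing down. These expressions hold for the stationary process and say nothing about finite-sample effects, gaps, or outliers.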

Circularity Check

0 steps flagged

Core mathematical result is a direct algebraic derivation independent of inputs or self-citations

Full rationale

The paper's central claim rests on a rigorous mathematical analysis establishing that agreement between variance- and autocorrelation-based indicators is driven by the initial data point. This follows directly from the statistical definitions of the indicators without any reduction to fitted parameters, self-citations, or ansatzes. Subsequent demonstrations on synthetic data (generated under explicit stationarity assumptions) and empirical records use standard public datasets and do not close the argument via author-overlapping citations or renaming of known results. No load-bearing step in the derivation chain is equivalent to its inputs by construction, making the analysis self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The paper relies on standard assumptions about time-series stationarity and noise models for the synthetic experiments, plus the choice of specific gap and outlier insertion mechanisms; no new physical entities are postulated.

free parameters (1)
  • synthetic process parameters (e.g., noise amplitude, autocorrelation length)
    Chosen to generate representative critical-slowing-down trajectories; values are not fitted to the target resilience result but define the testbed.
axioms (1)
  • domain assumption: The time series are realizations of a stochastic process whose variance and lag-1 autocorrelation can be computed directly from the data without additional model fitting.
    Invoked when defining the resilience indicators and when deriving their statistical dependence.
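
To make "computed directly from the data" concrete, one common form of the two sample estimators for a series x_1, ..., x_n is given below. The normalization is an assumption of this note; the paper may use a different convention (for example dividing by n - 1, or using separate means for the lagged and unlagged segments).

```latex
\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t,
\qquad
\hat{\sigma}^2 = \frac{1}{n} \sum_{t=1}^{n} \left( x_t - \bar{x} \right)^2

\hat{\rho}_1 =
\frac{\sum_{t=1}^{n-1} \left( x_t - \bar{x} \right)\left( x_{t+1} - \bar{x} \right)}
     {\sum_{t=1}^{n} \left( x_t - \bar{x} \right)^2}
```

Both estimators share the same sum of squared deviations, which appears as the numerator of \hat{\sigma}^2 (up to the factor 1/n) and as the denominator of \hat{\rho}_1. Neither requires fitting a model to the series.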


