pith. machine review for the scientific record.

arxiv: 2605.13689 · v1 · submitted 2026-05-13 · 📊 stat.ME

Recognition: unknown

Moving beyond spatial and random cross-validation in environmental modelling: a call for prediction-domain adaptive evaluation

Authors on Pith · no claims yet

Pith reviewed 2026-05-14 17:49 UTC · model grok-4.3

classification 📊 stat.ME
keywords cross-validation · spatial modelling · environmental modelling · prediction accuracy · map evaluation · interpolation · extrapolation

The pith

Prediction-domain adaptive cross-validation provides reliable accuracy estimates across interpolation and extrapolation scenarios.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that standard random and spatial cross-validation methods are each suited to only one extreme of the prediction domain: random cross-validation to training points randomly distributed over the prediction area, spatial cross-validation to full extrapolation. It proposes a new category, prediction-domain adaptive evaluation, in which the evaluation flexibly adapts to the particular prediction situation. This matters because most real-world cases in environmental modelling fall between random sampling and full extrapolation, where fixed methods yield unreliable accuracy estimates; the consequences extend to model tuning and the trustworthiness of spatial predictions in ecology. The authors ground the argument empirically by reproducing an earlier simulation study that compares the different methods.

Core claim

We advocate for prediction-domain adaptive evaluation as a new category of cross-validation methods that flexibly adapt to the prediction situation, yielding the most reliable estimates of map accuracy across different scenarios.

What carries the argument

Prediction-domain adaptive evaluation methods, which adjust cross-validation to the specific prediction domain instead of using fixed random or spatial approaches.

If this is right

  • Random cross-validation is suitable when training points are randomly distributed in the prediction area.
  • Spatial cross-validation performs better in extrapolation situations.
  • Most cases lie on a continuum between these two extremes.
  • Adaptive methods can give the most reliable accuracy estimates across scenarios.
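The contrast in the first two bullets can be made concrete with a small illustration (an editorial sketch, not code from the paper; the 1-nearest-neighbour model, the strip width, and the synthetic response are all invented for demonstration): on spatially autocorrelated data, random folds place test points next to training points, while spatial folds hold out whole blocks.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
xy = rng.uniform(0, 100, size=(n, 2))                 # training point coordinates
y = np.sin(xy[:, 0] / 10) + 0.1 * rng.normal(size=n)  # spatially structured response

def nn_predict(train_xy, train_y, test_xy):
    # 1-nearest-neighbour prediction: the simplest spatial interpolator.
    d = np.linalg.norm(test_xy[:, None, :] - train_xy[None, :, :], axis=2)
    return train_y[d.argmin(axis=1)]

def cv_mae(fold_ids, k=5):
    # Mean absolute error averaged over k held-out folds.
    errs = []
    for f in range(k):
        te = fold_ids == f
        pred = nn_predict(xy[~te], y[~te], xy[te])
        errs.append(np.abs(pred - y[te]).mean())
    return float(np.mean(errs))

random_mae = cv_mae(rng.integers(0, 5, size=n))                   # location-blind folds
spatial_mae = cv_mae(np.minimum(xy[:, 0] // 20, 4).astype(int))   # vertical-strip folds
```

With a spatially smooth response, the strip-based folds typically report a higher error because each held-out strip lies farther from its training data; which estimate is "right" depends on where the prediction domain sits on the continuum, which is exactly the paper's point.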

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Developing concrete algorithms for this adaptive category could improve practical model evaluation in environmental science.
  • This call might lead to new research on estimating the appropriate adaptation based on data distribution.
  • Reproducing simulation studies provides a way to empirically test and refine such methods.
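The second bullet can be made operational with a rough distance diagnostic (an editorial sketch, not a method from the paper; the statistic and the synthetic areas are invented): compare nearest-neighbour distances from prediction locations to the training sample against the training sample's own internal nearest-neighbour distances, yielding an index of where a task sits between interpolation and extrapolation.

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.uniform(0, 100, size=(200, 2))        # training locations
pred_near = rng.uniform(0, 100, size=(1000, 2))   # prediction area covering the samples
pred_far = rng.uniform(100, 200, size=(1000, 2))  # disjoint area (extrapolation-like)

def nn_dist(points, ref):
    # Distance from each point to its nearest reference point (brute force).
    d = np.linalg.norm(points[:, None, :] - ref[None, :, :], axis=2)
    return d.min(axis=1)

# Reference scale: typical training-to-training nearest-neighbour distance.
dtt = np.linalg.norm(train[:, None, :] - train[None, :, :], axis=2)
np.fill_diagonal(dtt, np.inf)                     # exclude each point's self-distance
scale = float(np.median(dtt.min(axis=1)))

ratio_near = float(np.median(nn_dist(pred_near, train))) / scale  # ~1: interpolation
ratio_far = float(np.median(nn_dist(pred_far, train))) / scale    # >>1: extrapolation
```

A ratio near 1 suggests random cross-validation is representative, while a large ratio signals an extrapolation setting; an adaptive method could, in principle, tune its fold geometry to match this distance distribution.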

Load-bearing premise

Practical methods in the prediction-domain adaptive category can be developed that outperform random and spatial cross-validation in providing reliable accuracy estimates.

What would settle it

A study demonstrating that adaptive methods do not provide superior or more reliable accuracy estimates compared to existing methods across the continuum would falsify the proposal.

Figures

Figures reproduced from arXiv: 2605.13689 by Hanna Meyer, Jakub Nowosad, Jan Linnenbrink.

Figure 1. The upper row (A) shows sampling designs ranging from randomly distributed training points (left) through … [caption truncated; full figure at source]
Figure 2. In our simulation study, we generated four training sampling designs (random, biased, clustered, and …) [caption truncated; full figure at source]
Original abstract

With the growing application of spatial predictive modeling in ecology, the question of how to appropriately evaluate the resulting maps has gained increasing attention. While there is consensus that map accuracy is ideally estimated using an independent probability sample of the prediction area, there is still no agreement on the most appropriate way to conduct an evaluation for the common case when such a sample is not available. Cross-validation, which involves multiple train-test splits, is commonly applied not only to estimate final model accuracy but also to guide model tuning and selection. Many different spatial and non-spatial approaches to cross-validation have been proposed, and approaches in both groups have faced substantial criticism. It has been shown that random cross-validation methods are suitable when the training points are randomly distributed in the prediction area, while spatial cross-validation is better suited towards extrapolation situations. In practice, however, there is a continuum and most cases are between those two extremes. To address this gap, we advocate for a new category of cross-validation methods to account for this: prediction-domain adaptive evaluation. Methods in this category flexibly adapt to the prediction situation, yielding most reliable estimates of map accuracy across different scenarios. To ground this perspective empirically, we reproduce a simulation study that was used in earlier research and systematically compare different evaluation methods and discuss their purpose.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript argues that random cross-validation suits interpolation while spatial cross-validation suits extrapolation, but most real-world cases lie on a continuum between these extremes. It therefore advocates a new category of 'prediction-domain adaptive evaluation' methods that flexibly adapt to the prediction situation to yield more reliable map-accuracy estimates. The perspective is grounded by reproducing an earlier simulation study that systematically compares existing evaluation methods.

Significance. If concrete, implementable methods belonging to the proposed prediction-domain adaptive category can be developed and shown to reduce error in accuracy estimation relative to random or spatial cross-validation across the interpolation-extrapolation continuum, the work would meaningfully advance model-evaluation practice in spatial environmental modeling. The reproduction of the prior simulation study supplies a useful empirical anchor for the discussion and highlights the practical limitations of current approaches.

major comments (3)
  1. [Abstract] The claim that prediction-domain adaptive methods 'flexibly adapt to the prediction situation, yielding most reliable estimates of map accuracy across different scenarios' is presented as the central recommendation, yet the manuscript introduces no new algorithm, pseudocode, or quantitative comparison demonstrating the superiority of any member of this category.
  2. [Simulation study] While the reproduced study compares known random and spatial cross-validation methods, it contains no member of the advocated prediction-domain adaptive category, so the manuscript provides no direct empirical evidence that such methods outperform existing ones on the continuum.
  3. [Discussion] The call for prediction-domain adaptive evaluation rests on the untested premise that practical methods in this category can be constructed and will demonstrably improve accuracy estimation; no feasibility argument, example implementation, or falsifiable prediction is supplied to support this premise.
minor comments (1)
  1. [Abstract] The term 'prediction-domain adaptive evaluation' is introduced without a concise operational definition; adding one sentence that distinguishes it from both random and spatial CV would improve clarity for readers.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback and for highlighting the need for clearer distinctions between advocacy and empirical demonstration in our perspective. We address each major comment below, proposing targeted revisions to better align the manuscript's claims with its scope as a call for future methodological development rather than a presentation of new methods.

Point-by-point responses
  1. Referee: [Abstract] The claim that prediction-domain adaptive methods 'flexibly adapt to the prediction situation, yielding most reliable estimates of map accuracy across different scenarios' is presented as the central recommendation, yet the manuscript introduces no new algorithm, pseudocode, or quantitative comparison demonstrating the superiority of any member of this category.

    Authors: We agree that the abstract phrasing implies a stronger empirical foundation than the manuscript provides. The paper is a perspective advocating for the development of prediction-domain adaptive evaluation methods, not introducing or testing any specific member of that category. We will revise the abstract to frame the recommendation as a call for future research to create and validate such adaptive methods, grounded in the identified shortcomings of random and spatial cross-validation. revision: yes

  2. Referee: [Simulation study] While the reproduced study compares known random and spatial cross-validation methods, it contains no member of the advocated prediction-domain adaptive category, so the manuscript provides no direct empirical evidence that such methods outperform existing ones on the continuum.

    Authors: The reproduced simulation study is included solely to demonstrate the limitations of existing random and spatial cross-validation approaches across interpolation-extrapolation scenarios, providing an empirical basis for why a new category is needed. It is not designed to test prediction-domain adaptive methods, as none are developed here. We will add explicit language in the simulation section to clarify its role as an anchor for the perspective rather than a comparative evaluation of the proposed category. revision: partial

  3. Referee: [Discussion] The call for prediction-domain adaptive evaluation rests on the untested premise that practical methods in this category can be constructed and will demonstrably improve accuracy estimation; no feasibility argument, example implementation, or falsifiable prediction is supplied to support this premise.

    Authors: We acknowledge that the manuscript offers no concrete implementation or direct test of feasibility for prediction-domain adaptive methods. As a perspective piece, its primary aim is to identify the gap and encourage development of such methods. We will expand the discussion to include a brief outline of potential directions (e.g., adapting domain-adaptation or distance-weighted sampling techniques to the prediction domain) to provide initial feasibility grounding and falsifiable hypotheses for future work, without claiming to resolve the challenge. revision: yes
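The distance-weighted direction mentioned in this response can be sketched with a toy density-ratio weighting (an editorial illustration, not the authors' method; the 1-D setting, the histogram binning, and the error model are all invented): held-out errors are reweighted so that the evaluated distribution resembles the prediction domain.

```python
import numpy as np

rng = np.random.default_rng(2)
train = rng.uniform(0, 100, size=300)   # held-out locations, uniform over [0, 100)
pred = rng.uniform(50, 100, size=1000)  # prediction domain concentrated at high values

# Density-ratio weights from histograms: w(x) proportional to p_pred(x) / p_train(x).
bins = np.linspace(0, 100, 21)
p_train, _ = np.histogram(train, bins=bins, density=True)
p_pred, _ = np.histogram(pred, bins=bins, density=True)
idx = np.clip(np.digitize(train, bins) - 1, 0, len(bins) - 2)
w = p_pred[idx] / np.maximum(p_train[idx], 1e-12)
w /= w.sum()

# Toy per-point CV errors that grow with location, so the two summaries differ.
errors = train / 100
plain_mae = float(errors.mean())          # ordinary CV average (~0.5 here)
weighted_mae = float((w * errors).sum())  # prediction-domain-weighted average (~0.75)
```

When the prediction domain is shifted relative to the training sample, the weighted summary emphasises errors in the region actually being predicted; how to estimate such weights robustly in realistic spatial settings is exactly the open question this rebuttal points to.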

Circularity Check

0 steps flagged

No significant circularity: the perspective advocates a new category on the basis of a reproduced external simulation, without self-referential derivations.

Full rationale

The manuscript contains no equations, fitted parameters, or derivation chain. It reproduces a simulation study from earlier external research and compares existing random and spatial cross-validation methods. The proposal for 'prediction-domain adaptive evaluation' is a conceptual category without any self-definition, fitted-input prediction, or load-bearing self-citation that reduces the central claim to its own inputs by construction. The argument relies on external literature and empirical comparison of known approaches, making it self-contained against the circularity criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The claim depends on the domain assumption that adaptive methods are feasible and superior, plus the newly introduced conceptual category itself, which lacks independent evidence or implementation.

axioms (2)
  • domain assumption Independent probability samples of the prediction area are ideal for accuracy estimation but frequently unavailable
    Explicitly stated in the opening of the abstract.
  • domain assumption Real prediction situations form a continuum between random and spatial extremes
    Stated directly: 'in practice, however, there is a continuum and most cases are between those two extremes.'
invented entities (1)
  • prediction-domain adaptive evaluation no independent evidence
    purpose: A new category of cross-validation methods that flexibly adapt to the prediction situation
    Introduced in the abstract as the advocated solution; no specific algorithm or independent evidence provided.

pith-pipeline@v0.9.0 · 5534 in / 1366 out tokens · 58239 ms · 2026-05-14T17:49:23.597654+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

45 extracted references · 35 canonical work pages · 1 internal anchor

  1. [1]

    Nature Communications , author =

    Spatial validation reveals poor predictive performance of large-scale ecological mapping models , volume =. Nature Communications , author =. 2020 , pages =

  2. [2]

    Nature Communications , author =

    Machine learning-based global maps of ecological variables and the challenge of assessing them , volume =. Nature Communications , author =. 2022 , pages =

  3. [3]

    Methods in Ecology and Evolution , volume =

    Meyer, Hanna and Pebesma, Edzer , title =. Methods in Ecology and Evolution , volume =. doi:https://doi.org/10.1111/2041-210X.13650 , url =. https://besjournals.onlinelibrary.wiley.com/doi/pdf/10.1111/2041-210X.13650 , abstract =

  4. [4]

    Wadoux and Gerard B.M

    Alexandre M.J.-C. Wadoux and Gerard B.M. Heuvelink and Sytze. Spatial cross-validation is not the right way to evaluate map accuracy , journal =. 2021 , issn =. doi:https://doi.org/10.1016/j.ecolmodel.2021.109692 , url =

  5. [5]

    Brenning , title =

    A. Brenning , title =. 2012 IEEE International Geoscience and Remote Sensing Symposium , year =. doi:10.1109/IGARSS.2012.6352393 , file =

  6. [6]

    and Bahn, Volker and Ciuti, Simone and Boyce, Mark S

    Roberts, David R. and Bahn, Volker and Ciuti, Simone and Boyce, Mark S. and Elith, Jane and Guillera-Arroita, Gurutzeta and Hauenstein, Severin and Lahoz-Monfort, Jos\'es J. and Schr. Ecography , year =. doi:10.1111/ecog.02881 , publisher =

  7. [7]

    Ecological Modelling , year =

    Patrick Schratz and Jannes Muenchow and Eugenia Iturritxa and Jakob Richter and Alexander Brenning , title =. Ecological Modelling , year =. doi:https://doi.org/10.1016/j.ecolmodel.2019.06.002 , keywords =

  8. [8]

    Environmental Modelling & Software , year =

    Hanna Meyer and Christoph Reudenbach and Tomislav Hengl and Marwan Katurji and Thomas Nauss , title =. Environmental Modelling & Software , year =. doi:https://doi.org/10.1016/j.envsoft.2017.12.001 , keywords =

  9. [9]

    and Guillera-Arroita, Gurutzeta , title =

    Valavi, Roozbeh and Elith, Jane and Lahoz-Monfort, Jose J. and Guillera-Arroita, Gurutzeta , title =. bioRxiv , year =. doi:10.1101/357798 , eprint =

  10. [10]

    International Journal of Geographical Information Science , year =

    Jonne Pohjankukka and Tapio Pahikkala and Paavo Nevalainen and Jukka Heikkonen , title =. International Journal of Geographical Information Science , year =. doi:10.1080/13658816.2017.1346255 , eprint =

  11. [11]

    Methods in Ecology and Evolution , author =

    Nearest neighbour distance matching. Methods in Ecology and Evolution , author =. 2022 , note =. doi:10.1111/2041-210X.13851 , language =

  12. [12]

    Ecological Modelling , author =

    Importance of spatial predictor variable selection in machine learning applications –. Ecological Modelling , author =. 2019 , pages =. doi:10.1016/j.ecolmodel.2019.108815 , abstract =

  13. [13]

    and Mil\`a, C

    Linnenbrink, J. and Mil\`a, C. and Ludwig, M. and Meyer, H. , TITLE =. Geoscientific Model Development , VOLUME =. 2024 , NUMBER =

  14. [14]

    Journal of Statistical Software , author=

    mlr3spatiotempcv: Spatiotemporal Resampling Methods for Machine Learning in R , volume=. Journal of Statistical Software , author=. 2024 , pages=. doi:10.18637/jss.v111.i07 , number=

  15. [15]

    2023 , eprint =

    Assessing the performance of spatial cross-validation approaches for models of spatially structured data , author =. 2023 , eprint =. doi:10.48550/arXiv.2303.07334 , url =

  16. [16]

    2023 , url =

    Walid Ghariani , title =. 2023 , url =

  17. [17]

    Journal of Open Source Software , volume =

    Uieda, Leonardo , year =. Journal of Open Source Software , volume =

  18. [18]

    Nature Climate Change , author =

    Estimated carbon dioxide emissions from tropical deforestation improved by carbon-density maps , volume =. Nature Climate Change , author =. 2012 , pages =. doi:10.1038/nclimate1354 , language =

  19. [19]

    Ecological Informatics , author =

    Dealing with clustered samples for assessing map accuracy by cross-validation , volume =. Ecological Informatics , author =. 2022 , note =. doi:10.1016/j.ecoinf.2022.101665 , language =

  20. [20]

    International Journal of Applied Earth Observation and Geoinformation , author =

    Spatial+:. International Journal of Applied Earth Observation and Geoinformation , author =. 2023 , pages =. doi:10.1016/j.jag.2023.103364 , language =

  21. [21]

    Ecological Informatics , author =

    A dissimilarity-adaptive cross-validation method for evaluating geospatial machine learning predictions with clustered samples , volume =. Ecological Informatics , author =. 2025 , pages =. doi:10.1016/j.ecoinf.2025.103287 , language =

  22. [22]

    Global Ecology and Biogeography , volume =

    Ludwig, Marvin and Moreno-Martinez, Alvaro and Hölzel, Norbert and Pebesma, Edzer and Meyer, Hanna , title =. Global Ecology and Biogeography , volume =. doi:https://doi.org/10.1111/geb.13635 , url =. https://onlinelibrary.wiley.com/doi/pdf/10.1111/geb.13635 , abstract =

  23. [23]

    and Jandt, Ute and Jansen, Florian and Jiménez-Alfaro, Borja and Kattge, Jens and Levesley, Aurora and Pillar, Valério D

    Sabatini, Francesco Maria and Lenoir, Jonathan and Hattab, Tarek and Arnst, Elise Aimee and Chytrý, Milan and Dengler, Jürgen and De Ruffray, Patrice and Hennekens, Stephan M. and Jandt, Ute and Jansen, Florian and Jiménez-Alfaro, Borja and Kattge, Jens and Levesley, Aurora and Pillar, Valério D. and Purschke, Oliver and Sandel, Brody and Sultana, Fahmida...

  24. [24]

    Batjes, N. H. and Ribeiro, E. and van Oostrum, A. and Leenaars, J. and Hengl, T. and Mendes de Jesus, J. , TITLE =. Earth System Science Data , VOLUME =. 2017 , NUMBER =

  25. [25]

    Scientific Data , author =

    The. Scientific Data , author =. 2020 , pages =. doi:10.1038/s41597-020-0534-3 , abstract =

  26. [26]

    GBIF Home Page , year =

  27. [27]

    Nature , author =

    Soil nematode abundance and functional group composition at a global scale , volume =. Nature , author =. 2019 , pages =. doi:10.1038/s41586-019-1418-6 , language =

  28. [28]

    Science , author =

    The global tree restoration potential , volume =. Science , author =. 2019 , pages =. doi:10.1126/science.aax0848 , language =

  29. [29]

    New Phytologist , author =

    Global models and predictions of plant diversity based on advanced machine learning techniques , volume =. New Phytologist , author =. 2023 , pages =. doi:10.1111/nph.18533 , language =

  30. [30]

    Hastie, Trevor and Tibshirani, Robert and Friedman, Jerome , year =. The

  31. [31]

    International Journal of Remote Sensing , author =

    Sampling designs for accuracy assessment of land cover , volume =. International Journal of Remote Sensing , author =. 2009 , pages =. doi:10.1080/01431160903131000 , language =

  32. [32]

    Remote Sensing of Environment , author =

    Practical. Remote Sensing of Environment , author =. 2000 , pages =. doi:10.1016/S0034-4257(99)00090-5 , language =

  33. [33]

    and Orr, Michael C

    Hughes, Alice C. and Orr, Michael C. and Ma, Keping and Costello, Mark J. and Waller, John and Provoost, Pieter and Yang, Qinmin and Zhu, Chaodong and Qiao, Huijie , title =. Ecography , volume =. doi:https://doi.org/10.1111/ecog.05926 , url =. https://nsojournals.onlinelibrary.wiley.com/doi/pdf/10.1111/ecog.05926 , year =

  34. [34]

    European Journal of Soil Science , author =

    Sampling for validation of digital soil maps , volume =. European Journal of Soil Science , author =. 2011 , pages =. doi:10.1111/j.1365-2389.2011.01364.x , language =

  35. [35]

    2025 , note =

    CAST: 'caret' Applications for Spatial-Temporal Models , author =. 2025 , note =

  36. [36]

    Natural Hazards and Earth System Sciences , author =

    Spatial prediction models for landslide hazards: review, comparison and evaluation , volume =. Natural Hazards and Earth System Sciences , author =. 2005 , pages =. doi:10.5194/nhess-5-853-2005 , language =

  37. [37]

    Ecography , volume =

    Huang, Hongwei and Zhang, Zhixin and Bede-Fazekas, Ákos and Mammola, Stefano and Gu, Jiqi and Zhou, Jinxin and Qu, Junmei and Lin, Qiang , title =. Ecography , volume =. doi:https://doi.org/10.1111/ecog.07354 , year =

  38. [38]

    Ecography , author =

    Projecting spatiotemporal bioclimatic niche dynamics of endemic. Ecography , author =. 2026 , pages =. doi:10.1002/ecog.08067 , language =

  39. [39]

    Ecology and Evolution , author =

    Modeling. Ecology and Evolution , author =. 2026 , pages =. doi:10.1002/ece3.72031 , language =

  40. [40]

    Kuhn, Max and Johnson, Kjell , year =. Applied. doi:10.1007/978-1-4614-6849-3 , language =

  41. [41]

    Statistics for spatial data , isbn =

  42. [42]

    Brus and J.J

    D.J. Brus and J.J. Random sampling or geostatistical modelling? Choosing between design-based and model-based sampling strategies for soil (with discussion) , journal =. 1997 , issn =. doi:https://doi.org/10.1016/S0016-7061(97)00072-4 , url =

  43. [43]

    Nature Communications , author =

    Crowdsourced biodiversity monitoring fills gaps in global plant trait mapping , volume =. Nature Communications , author =. 2026 , pages =. doi:10.1038/s41467-026-68996-y , language =

  44. [44]

    Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction

    Brenning, Alexander and Suesse, Thomas , year =. Aligning. doi:10.48550/ARXIV.2603.29981 , urldate =

  45. [45]

    Xue, Peipei and Minasny, Budiman and Román Dobarco, Mercedes and Wadoux, Alexandre M. J.-C. and Padarian Campusano, Jose and Bissett, Andrew and de Caritat, Patrice and McBratney, Alex , title =. Global Change Biology , volume =. doi:https://doi.org/10.1111/gcb.70268 , url =. https://onlinelibrary.wiley.com/doi/pdf/10.1111/gcb.70268 , note =