A Bayesian Updating Framework for Long-term Multi-Environment Trial Data in Plant Breeding
Pith reviewed 2026-05-10 07:45 UTC · model grok-4.3
The pith
Bayesian updating with historical windows stabilizes variance component estimates in multi-environment plant trials.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a Bayesian reformulation of the linear mixed model, with priors informed by successive historical data windows, keeps variance components positive and yields realistic distributional estimates through MCMC sampling. The variance component priors and posteriors are conjugate and belong to the inverse gamma and inverse Wishart families, so historical MET information can be integrated objectively to improve variance component estimation and subsequent experimental design.
What carries the argument
Successive historical data windows that supply conjugate priors for variance components in a Bayesian linear mixed model, with MCMC sampling from the resulting posteriors.
If this is right
- Variance components remain strictly positive and carry full distributional information rather than point estimates that can collapse to zero.
- Historical data are incorporated through an objective windowing procedure that updates priors for the current analysis.
- Posterior samples enable direct use in optimality criteria such as A-optimality for determining trial allocations to agro-ecological zones.
- The conjugacy of the inverse gamma and inverse Wishart families simplifies the Bayesian updating calculations.
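The conjugate updating described in these points can be illustrated for a single variance component. The following is a minimal sketch, not the paper's implementation: it assumes the zero-mean Gaussian effects of each window are observed directly, whereas in a full BLMM they are latent and the same update would sit inside an MCMC step. The function names, the starting prior, and the simulated windows are all hypothetical.

```python
import random

def update_inverse_gamma(shape, scale, effects):
    """Conjugate update of an IG(shape, scale) prior on a variance,
    given zero-mean Gaussian effects from one historical window."""
    n = len(effects)
    sum_sq = sum(e * e for e in effects)
    return shape + n / 2.0, scale + sum_sq / 2.0

def inverse_gamma_mean(shape, scale):
    """Mean of IG(shape, scale); only defined for shape > 1."""
    if shape <= 1:
        raise ValueError("mean undefined for shape <= 1")
    return scale / (shape - 1.0)

def sample_inverse_gamma(shape, scale, rng):
    """If X ~ Gamma(shape, rate=scale), then 1/X ~ IG(shape, scale).
    random.gammavariate takes a scale parameter, hence 1/scale."""
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)

rng = random.Random(1)
true_variance = 4.0  # assumption: used only to simulate toy data
windows = [
    [rng.gauss(0.0, true_variance ** 0.5) for _ in range(50)]
    for _ in range(3)
]

shape, scale = 2.0, 1.0  # hypothetical weak starting prior
for window in windows:
    # The posterior of one window becomes the prior for the next.
    shape, scale = update_inverse_gamma(shape, scale, window)

posterior_mean = inverse_gamma_mean(shape, scale)
draws = [sample_inverse_gamma(shape, scale, rng) for _ in range(1000)]
```

Every draw is strictly positive by construction, which is the property contrasted with REML estimates that can collapse to zero.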
Where Pith is reading between the lines
- The windowed updating scheme could be tested for robustness by varying window lengths and checking sensitivity of the resulting allocations.
- Improved variance estimates may allow more reliable extrapolation of genotype performance to future or untested environments within the target population.
- The framework might extend naturally to other accumulating longitudinal datasets where variance components need to be kept positive and informed by history.
Load-bearing premise
Historical MET data can be partitioned into successive windows that objectively inform priors without introducing bias or temporal mismatch between past and current environments.
What would settle it
If independent validation data show that variance components and trial allocations derived from the Bayesian-updated posteriors yield poorer predictive accuracy for genotype performance across environments than those from standard REML estimation, the practical advantage would be falsified.
Original abstract
In variety testing, multi-environment trials (MET) are essential for evaluating the genotypic performance of crop plants. A persistent challenge in the statistical analysis of MET data is the estimation of variance components, which are often still inaccurately estimated or shrunk to exactly zero when using residual (restricted) maximum likelihood (REML) approaches. At the same time, institutions conducting MET typically possess extensive historical data that can, in principle, be leveraged to improve variance component estimation. However, these data are rarely incorporated sufficiently. The purpose of this paper is to address this gap by proposing a Bayesian framework that systematically integrates historical information to stabilize variance component estimation and better quantify uncertainty. Our Bayesian linear mixed model (BLMM) reformulation uses priors and Markov chain Monte Carlo (MCMC) methods to maintain the variance components as positive, yielding more realistic distributional estimates. Furthermore, our model incorporates historical prior information by managing MET data in successive historical data windows. Variance component prior and posterior distributions are shown to be conjugate and belong to the inverse gamma and inverse Wishart families. While Bayesian methodology is increasingly being used for analyzing MET data, to the best of our knowledge, this study comprises one of the first serious attempts to objectively inform priors in the context of MET data. This refers to the proposed Bayesian updating approach. To demonstrate the framework, we consider an application where posterior variance component samples are plugged into an A-optimality experimental design criterion to determine the average optimal allocations of trials to agro-ecological zones in a sub-divided target population of environments (TPE).
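The final step of the abstract, plugging posterior samples into an A-optimality criterion, can be sketched with a toy criterion. This is an assumption-laden illustration, not the criterion derived in the paper: it treats zone means as independent, so that the trace of their covariance matrix is the sum of each zone's variance divided by its number of trials, and it reads "average optimal allocations" as the mean of the per-sample optimal allocations. All function names and the two example draws are hypothetical.

```python
from itertools import product

def a_criterion(allocation, zone_variances):
    """Toy A-criterion: trace of the covariance matrix of independent
    zone means, i.e. sum of variance / number of trials per zone."""
    return sum(v / n for v, n in zip(zone_variances, allocation))

def optimal_allocation(total_trials, zone_variances):
    """Brute-force search over all allocations with at least one
    trial per zone that use exactly total_trials trials."""
    n_zones = len(zone_variances)
    candidates = (
        alloc
        for alloc in product(range(1, total_trials + 1), repeat=n_zones)
        if sum(alloc) == total_trials
    )
    return min(candidates, key=lambda alloc: a_criterion(alloc, zone_variances))

def average_optimal_allocation(total_trials, draws):
    """Optimize once per posterior draw, then average zone by zone."""
    allocations = [optimal_allocation(total_trials, d) for d in draws]
    n_zones = len(draws[0])
    return tuple(
        sum(a[z] for a in allocations) / len(allocations)
        for z in range(n_zones)
    )

# Two hypothetical posterior draws of per-zone variances (three zones).
draws = [(1.0, 4.0, 9.0), (9.0, 4.0, 1.0)]
average = average_optimal_allocation(12, draws)
```

Under this toy criterion the optimal trial counts are proportional to the square roots of the zone variances, so the two draws give (2, 4, 6) and (6, 4, 2), and the average allocation is (4.0, 4.0, 4.0). Brute force is only practical for a handful of zones.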
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a Bayesian updating framework for long-term multi-environment trial (MET) data in plant breeding. It reformulates the linear mixed model using Bayesian methods with MCMC to ensure positive variance components, manages historical data in successive windows to set conjugate priors from the inverse-gamma and inverse-Wishart families, and demonstrates the approach by using posterior samples in an A-optimality criterion for allocating trials across agro-ecological zones in a target population of environments (TPE).
Significance. Should the conjugacy properties and unbiased updating hold, the framework offers a principled way to incorporate historical MET data into variance component estimation, potentially yielding more stable and realistic estimates than REML while properly quantifying uncertainty. This could have practical impact in plant breeding programs with extensive historical datasets.
major comments (2)
- [Methods / Model formulation] The central claim that variance component prior and posterior distributions are conjugate and belong to the inverse-gamma and inverse-Wishart families (Abstract) requires an explicit model specification and derivation in the methods section; without the linear mixed model equations and updating rules, it is impossible to verify whether the historical-window data produces the asserted conjugacy.
- [Framework description] The Bayesian updating scheme relies on partitioning MET data into successive historical windows to inform priors without temporal bias (Abstract and framework description). This assumption is load-bearing for the central claim; the manuscript does not address or test for non-stationarity in variance components (G, E, GxE) across windows due to climate trends or breeding progress, which risks miscalibrated posteriors.
minor comments (2)
- The application to A-optimality experimental design is mentioned but lacks any numerical results, simulation details, or real-data validation showing improved allocations.
- Clarify the objective criteria used to define and select the successive historical data windows, including any sensitivity checks.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive review. The comments highlight important areas for clarification and strengthening of the manuscript. We address each major comment below and outline the revisions we will implement.
- Referee: [Methods / Model formulation] The central claim that variance component prior and posterior distributions are conjugate and belong to the inverse-gamma and inverse-Wishart families (Abstract) requires an explicit model specification and derivation in the methods section; without the linear mixed model equations and updating rules, it is impossible to verify whether the historical-window data produces the asserted conjugacy.
Authors: We agree that the conjugacy claim requires a fully explicit derivation to allow verification. The current manuscript states the result but does not present the complete linear mixed model equations together with the prior-to-posterior updating rules for the inverse-gamma (error and genotype variances) and inverse-Wishart (genotype-by-environment covariance) distributions. In the revised version we will add a dedicated subsection in Methods that (i) writes the full BLMM, (ii) specifies the conjugate priors, and (iii) derives the closed-form posterior parameters after incorporating each successive historical window. This will make the conjugacy transparent and directly address the referee’s concern. revision: yes
- Referee: [Framework description] The Bayesian updating scheme relies on partitioning MET data into successive historical windows to inform priors without temporal bias (Abstract and framework description). This assumption is load-bearing for the central claim; the manuscript does not address or test for non-stationarity in variance components (G, E, GxE) across windows due to climate trends or breeding progress, which risks miscalibrated posteriors.
Authors: We acknowledge that the stationarity assumption across historical windows is central and that the manuscript does not currently examine potential non-stationarity arising from climate trends or genetic progress. In the revision we will (i) explicitly state the stationarity assumption and the rationale for window length selection, (ii) add a short discussion of possible sources of non-stationarity, and (iii) include a limited sensitivity analysis (either on real data subsets or via simulation) that perturbs variance components across windows and reports the resulting change in posterior means and credible intervals. These additions will quantify the robustness of the updating procedure under mild departures from stationarity. revision: yes
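The kind of sensitivity check promised in this response can be sketched for a single variance component. This is a hypothetical illustration under simplifying assumptions, not the authors' analysis: each window is summarized by its number of observations and its mean square, the inverse-gamma update is applied window by window, and the prior parameters and window values are invented for the example.

```python
def posterior_mean_after_windows(prior_shape, prior_scale, windows):
    """Apply the conjugate inverse-gamma update window by window.

    windows: list of (n_obs, mean_square) pairs, one per historical
    window, where mean_square is the average squared zero-mean effect.
    Returns the posterior mean of the variance.
    """
    shape, scale = prior_shape, prior_scale
    for n_obs, mean_square in windows:
        shape += n_obs / 2.0
        scale += n_obs * mean_square / 2.0
    return scale / (shape - 1.0)

# Hypothetical scenarios: the variance in the most recent window is 4.0.
stationary = [(100, 4.0), (100, 4.0), (100, 4.0)]
drifting = [(100, 2.0), (100, 3.0), (100, 4.0)]

mean_stationary = posterior_mean_after_windows(2.0, 4.0, stationary)
mean_drifting = posterior_mean_after_windows(2.0, 4.0, drifting)
```

With stationary windows the posterior mean stays at 4.0. With the drifting windows it falls to about 3.0, because all windows are weighted equally and the older, smaller variances pull the estimate away from the current value. This is the miscalibration risk the referee raises.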
Circularity Check
No significant circularity: historical windows supply external priors; conjugacy is algebraic under stated model
The derivation uses successive historical MET data windows to construct conjugate inverse-gamma and inverse-Wishart priors for variance components in a Bayesian linear mixed model, then updates via MCMC. This is a standard Bayesian updating step in which earlier data serve as external input rather than being re-fitted or renamed as a prediction from the target data. No self-citation chain, self-definitional loop, or ansatz smuggling is present in the abstract or claimed framework; the conjugacy result follows directly from the chosen likelihood-prior pair and does not reduce the final posterior estimates to quantities already obtained by construction from the same observations. The approach remains self-contained against external benchmarks.
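The statement that conjugacy follows directly from the likelihood-prior pair can be made concrete with the standard inverse-Wishart result. The notation below is assumed for illustration rather than taken from the paper's methods: b_g are M zero-mean Gaussian random-effect vectors of dimension Z, and the prior on their covariance matrix is inverse Wishart with degrees of freedom ν and scale matrix S.

```latex
% Assumed notation (illustrative, not the paper's model specification):
% b_g in R^Z, g = 1, ..., M; Sigma ~ IW(nu, S).
\begin{align*}
b_g \mid \Sigma &\sim \mathcal{N}_Z(0, \Sigma), \quad g = 1, \dots, M, \\
p(\Sigma) &\propto \det(\Sigma)^{-(\nu + Z + 1)/2}
  \exp\!\left(-\tfrac{1}{2}\operatorname{tr}\!\left(S \Sigma^{-1}\right)\right), \\
p(\Sigma \mid b_1, \dots, b_M) &\propto \det(\Sigma)^{-(\nu + Z + M + 1)/2}
  \exp\!\left(-\tfrac{1}{2}\left[\operatorname{tr}\!\left(S \Sigma^{-1}\right)
  + \sum_{g=1}^{M} b_g^{\top} \Sigma^{-1} b_g\right]\right) \\
&= \det(\Sigma)^{-(\nu + M + Z + 1)/2}
  \exp\!\left(-\tfrac{1}{2}\operatorname{tr}\!\left[\left(S
  + \sum_{g=1}^{M} b_g b_g^{\top}\right)\Sigma^{-1}\right]\right).
\end{align*}
% Hence Sigma | b_1, ..., b_M ~ IW(nu + M, S + sum_g b_g b_g^T).
```

For Z = 1 this reduces to the inverse-gamma case: the prior is IG(ν/2, S/2) and the posterior is IG((ν + M)/2, (S + Σ_g b_g²)/2). Using this posterior as the prior for the next window is the updating step in its simplest form.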
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Priors on the variance components of the linear mixed model for MET data are inverse-gamma or inverse-Wishart distributions, which are conjugate to the likelihood.
- ad hoc to paper: Historical MET data can be managed in successive windows to update priors without temporal bias.