pith. sign in

arxiv: 2606.13323 · v1 · pith:G7TUEHPNnew · submitted 2026-06-11 · 💻 cs.NE

Runtime Analysis of the (μ + 1)-ES in a Homogenous Progress Model

Pith reviewed 2026-06-27 04:59 UTC · model grok-4.3

classification 💻 cs.NE
keywords evolution strategieshomogeneous progress modelruntime analysis(μ+1)-ESgrowth rateGaussian distributionpopulation sizesteady-state analysis
0
0 comments X

The pith

In the homogeneous progress model the (μ+1)-ES grows at rate (log^{1+o(1)} μ / μ) times the single-parent rate for Gaussian fitness shifts when μ ≤ e^δ.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper defines a homogeneous progress model in which every mutation produces a fitness improvement drawn from a fixed distribution Z that never changes with current position. This model is applied to the steady-state (μ+1)-ES that keeps overlapping generations by selecting the best μ out of μ+1 candidates. The analysis proceeds by constructing two modified processes whose growth rates provably bracket the true rate, making the bounding processes tractable. For the concrete case Z equal to a normal distribution with mean minus δ, the expected growth rate R_μ satisfies R_μ equals (log to the power 1 plus little-o of μ divided by μ) times R_1 whenever μ is at most e to the power δ. The result supplies a precise population-size tradeoff for optimization regimes far from the global optimum.

Core claim

The paper establishes that in the homogeneous progress model defined by an invariant improvement distribution Z, the expected growth rate R_μ of the continuous steady-state (μ+1)-ES can be bounded by constructing simpler modified processes that sandwich the original process. When Z equals the normal distribution N(-δ, 1) and μ is at most e^δ, this technique yields the exact asymptotic relation R_μ = (log^{1 + o(1)} μ / μ) R_1.

What carries the argument

Modified processes that sandwich the growth rate of the original (μ+1)-ES, allowing the true rate to be bounded between two more tractable quantities.

If this is right

  • The growth rate of the (μ+1)-ES is asymptotically smaller than the (1+1)-ES rate by the factor log^{1+o(1)} μ divided by μ.
  • The bound holds only while μ stays at most e^δ for a Gaussian shift of size δ.
  • The model is intended for regimes far from the global optimum where exact fitness functions are intractable.
  • The sandwiching technique handles the dependencies created by overlapping generations in plus-selection.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar sandwiching arguments might yield growth-rate bounds for other selection rules or non-Gaussian Z distributions.
  • The scaling suggests that moderate population sizes could be preferable in practice once the logarithmic factor is taken into account.
  • The homogeneous model supplies a simple baseline for predicting behavior in black-box settings such as hyperparameter search.
  • The result could be tested by simulating the (μ+1)-ES on artificial landscapes whose improvement statistics are forced to remain stationary.

Load-bearing premise

The relative fitness improvement produced by any mutation is always drawn from the same fixed distribution Z no matter where the search currently stands.

What would settle it

Running an actual optimization problem and measuring whether the empirical distribution of fitness gains stays statistically constant across widely different fitness levels.

Figures

Figures reproduced from arXiv: 2606.13323 by Johannes Lengler, Raghu Raman Ravi.

Figure 1
Figure 1. Figure 1: Scaled steady-state growth rate Rµ/ξ(δ) of the noiseless (µ + 1)-ES across varying population sizes. The empirical dynamics collapse perfectly onto the theoretical idealised bound Hµ/µ, regardless of the penalty severity δ. Maximum for each curve is indicated by a triangle [PITH_FULL_IMAGE:figures/full_fig_p019_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Log-log plot of the steady-state growth rate vs. population size for δ = 5.0 under varying noise levels σ. While severe noise heavily penalizes small populations, larger populations naturally recover parallel asymptotic scaling. The results, plotted on a logarithmic scale in [PITH_FULL_IMAGE:figures/full_fig_p020_2.png] view at source ↗
read the original abstract

We introduce a new simple model to study the fitness progress of Evolution Strategies (ES) in generic problems. In this model, we bypass the underlying fitness landscape and assume that the mutation of any individual produces an offspring whose fitness relative to the parent is given by an invariant distribution $Z$, such as a mean-shifted Gaussian. This serves as a prototypical model for the optimisation landscape when an evolution algorithm operates far from the global optimum. This simple model can be used to approximate the optimisation process for problems where it is intractable to model the exact fitness function, including tasks such as hyperparameter tuning in machine learning models. We rigorously analyse the expected growth rate $\mathcal{R}_{\mu}$ of the continuous steady-state $(\mu+1)$-ES in this model. Unlike comma-selection strategies, the steady-state $(\mu+1)$-ES maintains overlapping generations, introducing complex mathematical dependencies among surviving parents that make it harder to analyse. We give a general technique to analyse the the $(\mu + 1)$-ES by constructing modified processes whose growth rates provably sandwich that of the original process. These modified processes are then easier to analyse but still close enough to the true process to give a tight bound on the expected growth rate. When $Z = \mathcal{N}(-\delta, 1)$ and $\mu \le e^{\delta}$, we show that $\mathcal{R}_{\mu} = \frac{\log^{1 + o(1)} \mu}{\mu} \mathcal{R}_1$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces the homogeneous progress model for Evolution Strategies, in which the relative fitness improvement of any mutation is drawn from an invariant distribution Z (e.g., a mean-shifted Gaussian) independent of current fitness or position. It analyzes the expected growth rate R_μ of the continuous steady-state (μ+1)-ES by constructing modified processes whose rates provably sandwich the original overlapping-generation process. For Z = N(-δ,1) and μ ≤ e^δ the paper claims the asymptotic R_μ = (log^{1+o(1)} μ / μ) R_1.

Significance. If the sandwiching argument and resulting asymptotic hold, the work supplies a rigorous, parameter-free scaling law for progress rates in a simplified model that approximates generic landscapes far from the optimum. The sandwiching technique itself is a methodological contribution for handling dependencies induced by overlapping generations. The model is positioned as a tool for intractable problems such as hyperparameter tuning.

major comments (2)
  1. [Abstract] Abstract: the central claim rests on a 'rigorous sandwiching argument' that yields the stated asymptotic together with explicit error terms and a treatment of the continuous steady-state; none of these derivations appear in sufficient detail to allow verification of the result beyond the high-level description.
  2. [Abstract] The modeling premise that Z is strictly invariant (independent of current fitness value or position) is definitional for the homogeneous progress model, yet the manuscript does not discuss the regime in which this approximation remains accurate or how deviations would affect the derived scaling.
minor comments (1)
  1. [Abstract] Notation: the symbol R_μ is introduced without an explicit equation defining it in terms of the steady-state distribution or expected improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive comments. We address each major point below and outline the revisions we intend to make to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim rests on a 'rigorous sandwiching argument' that yields the stated asymptotic together with explicit error terms and a treatment of the continuous steady-state; none of these derivations appear in sufficient detail to allow verification of the result beyond the high-level description.

    Authors: The full derivations of the sandwiching construction, the proof that the modified processes bound the original overlapping-generation process, the analysis of the continuous steady-state, and the explicit error terms supporting the asymptotic appear in Sections 3 and 4. The abstract follows the conventional high-level style. To improve verifiability we will add a concise proof outline to the introduction and ensure every error term is stated explicitly with its derivation reference. revision: partial

  2. Referee: [Abstract] The modeling premise that Z is strictly invariant (independent of current fitness value or position) is definitional for the homogeneous progress model, yet the manuscript does not discuss the regime in which this approximation remains accurate or how deviations would affect the derived scaling.

    Authors: We agree that an explicit discussion of the model's validity regime is useful. In the revised version we will insert a dedicated paragraph (or short subsection) after the model definition that identifies the distance-from-optimum regime in which invariance of Z is a reasonable approximation on typical smooth landscapes and that briefly indicates how moderate deviations from invariance would be expected to affect the leading scaling term. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained within the defined model

full rationale

The paper defines the homogeneous progress model explicitly as an invariant distribution Z for relative fitness improvements, independent of position. It then derives the growth rate bound R_μ via a sandwiching argument on auxiliary processes whose rates are analyzed directly from the model definition. No parameter is fitted to data and then renamed as a prediction; no self-citation chain is invoked as a load-bearing uniqueness theorem; the central claim reduces to standard probabilistic analysis of the explicitly stated process rather than to its own inputs by construction. The modeling premise is definitional, not a hidden assumption that collapses the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The model rests on a single domain assumption that relative fitness progress is i.i.d. from a fixed distribution Z; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Mutation of any individual produces an offspring whose fitness relative to the parent is given by an invariant distribution Z.
    This is the defining modeling choice that bypasses the underlying fitness landscape.

pith-pipeline@v0.9.1-grok · 5803 in / 1219 out tokens · 24619 ms · 2026-06-27T04:59:23.041610+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

26 extracted references · 1 linked inside Pith

  1. [1]

    In: Proceedings of the Genetic and Evolutionary Computation Conference

    Akimoto, Y., Auger, A., Glasmachers, T.: Drift theory in continuous search spaces: Expected hitting time of the(1 + 1)-ES with 1/5 success rule. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 801–808 (2018)

  2. [2]

    Theoretical Computer Science334(1-3), 35–69 (2005)

    Auger, A.: Convergence results for the(1, λ)-SA-ES using the theory ofϕ- irreducible Markov chains. Theoretical Computer Science334(1-3), 35–69 (2005)

  3. [3]

    Electronic Journal of Probability19(22), 1–17 (2014)

    Bérard, J., Maillard, P.: The limiting process of N-particle branching random walk with polynomial tails. Electronic Journal of Probability19(22), 1–17 (2014)

  4. [4]

    Springer Berlin Heidelberg (2001)

    Beyer, H.G.: The theory of Evolution Strategies. Springer Berlin Heidelberg (2001)

  5. [5]

    Physical Review E 73(5), 056126 (2006)

    Brunet, E., Derrida, B., Mueller, A.H., Munier, S.: Phenomenological theory giving the full statistics of the position of fluctuating pulled fronts. Physical Review E 73(5), 056126 (2006)

  6. [6]

    Physical Review E76(4), 041104 (2007)

    Brunet, E., Derrida, B., Mueller, A.H., Munier, S.: Effect of selection on ancestry: An exactly soluble case and its phenomenological generalization. Physical Review E76(4), 041104 (2007)

  7. [7]

    In: Proceedings of the Genetic and Evolutionary Computation Conference

    Dang, D.C., Lehre, P.K.: The SLO hierarchy of pseudo-Boolean functions and run- time of evolutionary algorithms. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 1551–1559 (2024)

  8. [8]

    Springer Nature (2019)

    Doerr, B., Neumann, F.: Theory of evolutionary computation: Recent develop- ments in discrete optimization. Springer Nature (2019)

  9. [9]

    Gissler, A.: Linear convergence of Evolution Strategies with covariance matrix adaptation. Ph.D. thesis, Institut Polytechnique de Paris (2024) 22 J. Lengler, R. Ravi

  10. [10]

    Evolutionary Computation28(1), 27–53 (2020)

    Glasmachers, T.: Global convergence of the(1 + 1)Evolution Strategy to a critical point. Evolutionary Computation28(1), 27–53 (2020)

  11. [11]

    arXiv preprint arXiv:1604.00772 (2016)

    Hansen, N.: The CMA Evolution Strategy: A tutorial. arXiv preprint arXiv:1604.00772 (2016)

  12. [12]

    Evolutionary Computation9(2), 159–195 (2001)

    Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in Evolution Strategies. Evolutionary Computation9(2), 159–195 (2001)

  13. [13]

    Theoretical Computer Science361(1), 38–56 (2006)

    Jägersküpper, J.: How the(1 + 1)ES using isotropic mutations minimizes positive definite quadratic forms. Theoretical Computer Science361(1), 38–56 (2006)

  14. [14]

    In: Proceedings of the Genetic and Evolutionary Computation Conference

    Jägersküpper, J.: Probabilistic runtime analysis of(1+, λ)ES using isotropic muta- tions. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 461–468 (2006)

  15. [15]

    In: Proceedings of the Genetic and Evolutionary Computation Conference

    Jägersküpper, J., Witt, C.: Rigorous runtime analysis of a(µ+ 1)ES for the sphere function. In: Proceedings of the Genetic and Evolutionary Computation Conference. pp. 849–856 (2005)

  16. [16]

    In: Proceedings of the International Conference on Intelligent Computing

    Jiang, W., Qian, C., Tang, K.: Improved running time analysis of the(1 + 1)-ES on the sphere function. In: Proceedings of the International Conference on Intelligent Computing. pp. 729–739. Springer (2018)

  17. [17]

    In: Theory of evolutionary computation: Recent devel- opments in discrete optimization, pp

    Lengler, J.: Drift analysis. In: Theory of evolutionary computation: Recent devel- opments in discrete optimization, pp. 89–131. Springer (2019)

  18. [18]

    IEEE Transactions on Evolutionary Computation24(6), 995–1009 (2019)

    Lengler,J.:Ageneraldichotomyofevolutionaryalgorithmsonmonotonefunctions. IEEE Transactions on Evolutionary Computation24(6), 995–1009 (2019)

  19. [19]

    In: Proceedings of the Evolutionary Computation in Combinatorial Optimization

    Lengler, J., Riedi, S.: Runtime analysis of the(µ+ 1)-EA on the dynamic Bin- Val function. In: Proceedings of the Evolutionary Computation in Combinatorial Optimization. pp. 84–99. Springer (2021)

  20. [20]

    In: Proceedings of the Foundations of Genetic Algorithms

    Lengler, J., Zou, X.: Exponential slowdown for larger populations: The(µ+ 1)-EA on monotone functions. In: Proceedings of the Foundations of Genetic Algorithms. pp. 87–101 (2019)

  21. [21]

    Tech- nometrics6(1), 101–102 (1964)

    Marsaglia, G.: Generating a variable from the tail of the normal distribution. Tech- nometrics6(1), 101–102 (1964)

  22. [22]

    IEEE Trans- actions on Evolutionary Computation28(2), 501–515 (2023)

    Morinaga, D., Fukuchi, K., Sakuma, J., Akimoto, Y.: Convergence rate of the (1 + 1)-ES on locally strongly convex and Lipschitz smooth functions. IEEE Trans- actions on Evolutionary Computation28(2), 501–515 (2023)

  23. [23]

    In: Proceedings of the Workshop on Simula- tionsmethoden in der Medizin und Biologie

    Rechenberg, I.: Evolutionsstrategien. In: Proceedings of the Workshop on Simula- tionsmethoden in der Medizin und Biologie. pp. 83–114. Springer (1978)

  24. [24]

    Biometrika41(1/2), 177–189 (1954)

    Shenton, L.: Inequalities for the normal integral including a new continued fraction. Biometrika41(1/2), 177–189 (1954)

  25. [25]

    Shi, Z., et al.: Branching random walks, vol. 2151. Springer (2015)

  26. [26]

    Evolutionary Computation14(1), 65–86 (2006)

    Witt, C.: Runtime analysis of the(µ+ 1)EA on simple pseudo-Boolean functions. Evolutionary Computation14(1), 65–86 (2006)