Prediction of neural network performance by phenotypic modeling

Adam Gaier; Alexander Hagg; J\"org Stork; Martin Zaefferer

arxiv: 1907.07075 · v1 · pith:FLMRBSPInew · submitted 2019-07-16 · 💻 cs.NE

Prediction of neural network performance by phenotypic modeling

Alexander Hagg , Martin Zaefferer , J\"org Stork , Adam Gaier This is my paper

Pith reviewed 2026-05-24 20:31 UTC · model grok-4.3

classification 💻 cs.NE

keywords surrogate modelsphenotypic distanceneuroevolutionvariable topologyperformance predictionrobotic navigationevolutionary optimization

0 comments

The pith

Phenotypic distance from output differences on shared inputs lets surrogate models predict neural network performance regardless of topology.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that differing neural networks can be embedded in a common space by measuring how their outputs diverge on the same input sequence, turning this phenotypic distance into input for surrogate models that estimate performance. This matters for neuroevolution and similar methods because genotypes change structure during search, destroying the fixed input space required for conventional data-driven modeling. The authors demonstrate the approach on a robotic navigation task, where phenotypic-embedding models perform as well as or better than models trained on the weight vectors of fixed-topology networks. A sympathetic reader would care because the method preserves the ability to use cheap surrogate evaluations even when the representation itself evolves.

Core claim

By feeding two networks the same input sequence and recording the difference in their output sequences, the phenotypic distance places networks of any topology into one shared space. Surrogate models trained on these distances can then predict objective values for unseen networks. In the robotic navigation task the resulting models match or exceed the accuracy of conventional surrogates that require identical weight-vector inputs from a fixed topology.

What carries the argument

Phenotypic distance: the scalar or vector difference between the output sequences produced by two networks on an identical input sequence; it supplies the coordinates that embed variable-topology networks into one modeling space.

If this is right

Surrogate-assisted optimization becomes feasible for any evolutionary method that mutates network structure.
The same embedding works for any controller or classifier domain in which input and output dimensions stay fixed even while internal connectivity changes.
Model training no longer requires a single fixed genotype length, removing a major barrier in neuroevolution.
Performance estimates can be obtained for candidate networks before they are evaluated on the expensive objective.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same phenotypic embedding could be applied to other variable-length representations such as genetic programming trees or variable-length strings.
If the distance metric proves smooth, it might allow gradient-based search in the phenotypic space itself rather than only in the original genotype space.
Because the embedding depends only on behavior, it might transfer across different tasks that share the same input-output interface.

Load-bearing premise

The distance between output sequences on a shared input sequence forms a space in which linear or other interpolation between observed networks accurately estimates performance on unseen topologies.

What would settle it

A direct test in the same robotic navigation task where performance predictions from the phenotypic model show no better correlation with true objective values than a random baseline when applied to networks whose topologies were not seen during model training.

Figures

Figures reproduced from arXiv: 1907.07075 by Adam Gaier, Alexander Hagg, J\"org Stork, Martin Zaefferer.

**Figure 2.** Figure 2: Sampling the phenotype to compare two individual [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Weight models are based on weight vectors for [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Evaluation takes place in a maze environment (a) [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: Distance map generated by MAP-Elites (lower dis [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗

**Figure 6.** Figure 6: The model quality in terms of correlation (x-axis), for linear and Kriging models and different input spaces, and [PITH_FULL_IMAGE:figures/full_fig_p006_6.png] view at source ↗

**Figure 7.** Figure 7: The number of linear model coefficients selected [PITH_FULL_IMAGE:figures/full_fig_p006_7.png] view at source ↗

read the original abstract

Surrogate models are used to reduce the burden of expensive-to-evaluate objective functions in optimization. By creating models which map genomes to objective values, these models can estimate the performance of unknown inputs, and so be used in place of expensive objective functions. Evolutionary techniques such as genetic programming or neuroevolution commonly alter the structure of the genome itself. A lack of consistency in the genotype is a fatal blow to data-driven modeling techniques: interpolation between points is impossible without a common input space. However, while the dimensionality of genotypes may differ across individuals, in many domains, such as controllers or classifiers, the dimensionality of the input and output remains constant. In this work we leverage this insight to embed differing neural networks into the same input space. To judge the difference between the behavior of two neural networks, we give them both the same input sequence, and examine the difference in output. This difference, the phenotypic distance, can then be used to situate these networks into a common input space, allowing us to produce surrogate models which can predict the performance of neural networks regardless of topology. In a robotic navigation task, we show that models trained using this phenotypic embedding perform as well or better as those trained on the weight values of a fixed topology neural network. We establish such phenotypic surrogate models as a promising and flexible approach which enables surrogate modeling even for representations that undergo structural changes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Phenotypic embedding via output differences lets surrogates handle variable topologies in neuroevolution, but the single-task result leaves generalization untested.

read the letter

The main takeaway is that this paper gives a practical way to build surrogate models for neuroevolution when network topologies are allowed to change. They embed networks into a shared space by computing the difference in outputs on a fixed input sequence, then train a model to predict performance from that distance instead of from the weights or structure directly. That sidesteps the variable-genotype problem that usually blocks interpolation-based surrogates.

Referee Report

2 major / 2 minor

Summary. The paper proposes embedding neural networks of varying topologies into a common space via phenotypic distance, defined as the difference in outputs produced by two networks on a shared input sequence. This embedding is then used to train surrogate models that predict network performance, enabling surrogate-assisted optimization even when genotypes change structure. On a robotic navigation task the authors report that phenotypic-embedding surrogates perform as well or better than weight-based surrogates trained on fixed-topology networks.

Significance. If the central empirical claim is substantiated, the work would provide a practical route to surrogate modeling in neuroevolution domains that permit topology variation, a setting where conventional fixed-length genotype surrogates cannot be applied directly. The approach is grounded in the observation that input/output dimensionality is often constant even when network structure is not, and the single-task comparison supplies an initial existence proof. No machine-checked proofs or parameter-free derivations are present, but the method is falsifiable via hold-out topology experiments.

major comments (2)

[Abstract and §4] Abstract and §4 (Experimental Results): the claim that phenotypic surrogates 'perform as well or better' rests on a single unreported experiment. No details are supplied on the input sequence used to compute distances, the precise distance metric, the surrogate training procedure, network-size controls, or statistical significance testing. These omissions make it impossible to judge whether the reported performance difference is attributable to the embedding or to confounding factors.
[§3] §3 (Phenotypic Embedding): no equations or pseudocode define the phenotypic distance or the subsequent surrogate regression. Without an explicit formulation it is unclear whether the distance correlates with the navigation objective or merely reflects output similarity on the chosen sequence; the abstract supplies no correlation plot or topology-hold-out validation that would confirm the embedding supports accurate interpolation for unseen topologies.

minor comments (2)

[Abstract] The abstract states the result but does not name the robotic navigation benchmark or the fixed-topology baseline architecture; adding these identifiers would improve reproducibility.
[§3] Notation for the phenotypic distance is introduced informally; a short equation or algorithm box would clarify how the distance is aggregated across the input sequence.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments highlight areas where the original submission lacked sufficient methodological detail and validation. We address each point below and will incorporate the requested clarifications and additional analyses into the revised manuscript.

read point-by-point responses

Referee: [Abstract and §4] Abstract and §4 (Experimental Results): the claim that phenotypic surrogates 'perform as well or better' rests on a single unreported experiment. No details are supplied on the input sequence used to compute distances, the precise distance metric, the surrogate training procedure, network-size controls, or statistical significance testing. These omissions make it impossible to judge whether the reported performance difference is attributable to the embedding or to confounding factors.

Authors: We agree that the original submission omitted critical experimental details, making independent assessment difficult. In the revision we will expand §4 with: (i) the exact input sequence (a 200-step recording of normalized sensor values from the robot's navigation environment), (ii) the distance metric (Euclidean norm on the concatenated output vectors), (iii) the surrogate procedure (Gaussian-process regression with automatic relevance determination), (iv) network-size controls (fixed-topology baselines matched for total parameter count), and (v) statistical testing (paired t-tests and effect-size reporting across 30 independent runs). These additions will allow readers to evaluate whether performance differences arise from the phenotypic embedding itself. revision: yes
Referee: [§3] §3 (Phenotypic Embedding): no equations or pseudocode define the phenotypic distance or the subsequent surrogate regression. Without an explicit formulation it is unclear whether the distance correlates with the navigation objective or merely reflects output similarity on the chosen sequence; the abstract supplies no correlation plot or topology-hold-out validation that would confirm the embedding supports accurate interpolation for unseen topologies.

Authors: We accept that the lack of formal definitions hindered clarity. The revised §3 will include: the phenotypic distance equation d(N_i, N_j) = ||f_i(X) - f_j(X)||_2 where X is the fixed input sequence and f denotes the network's output function; the surrogate regression formulation (kernel ridge regression or GP on the distance matrix); a scatter plot of phenotypic distance versus performance difference on the navigation task; and a topology hold-out experiment in which models are trained on one subset of topologies and evaluated on completely unseen topologies. These additions will demonstrate that the embedding supports interpolation beyond mere output similarity on the chosen sequence. revision: yes

Circularity Check

0 steps flagged

No circularity: phenotypic embedding is an independent behavioral metric with empirical validation

full rationale

The paper defines phenotypic distance externally as output differences on a shared input sequence and uses it to embed networks for surrogate modeling of performance. No equations, fitted parameters, or self-citations are shown that would make the reported performance predictions equivalent to the inputs by construction. The central result is an empirical comparison on one robotic task showing phenotypic surrogates match or exceed weight-based models; this does not reduce to a tautology or renaming of known results. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the untested domain assumption that output differences on a fixed input sequence form a useful similarity metric for performance prediction across topologies. No free parameters or invented entities are declared in the abstract.

axioms (1)

domain assumption Phenotypic distance defined by output difference on a shared input sequence yields a space suitable for interpolation and surrogate regression across differing network topologies.
Invoked when the authors state that the distance 'can then be used to situate these networks into a common input space'.

pith-pipeline@v0.9.0 · 5777 in / 1298 out tokens · 18655 ms · 2026-05-24T20:31:45.500174+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages · 1 internal anchor

[1]

C. C. Aggarwal, A. Hinneburg, and D. A. Keim. 2001. On the Surprising Behavior of Distance Metrics in High Dimensional Space. In Database Theory — ICDT 2001: 8th International Conference (LNCS) . London, UK

work page 2001
[2]

W. J. Conover and R. L. Iman. 1979.On Multiple-comparisons Procedures. Technical Report LA-7677-MS. Los Alamos Sci. Lab

work page 1979
[3]

S. J. Daniels, A. A. M. Rahat, R. M. Everson, G. R. Tabor, and J. E. Fieldsend

work page
[4]

In International Conference on Parallel Problem Solving from Nature

A Suite of Computationally Expensive Shape Optimisation Problems Using Computational Fluid Dynamics. In International Conference on Parallel Problem Solving from Nature

work page
[5]

R. Dawkins. 1982. The Extended Phenotype . Oxford University Press Oxford

work page 1982
[6]

De Grave, J

K. De Grave, J. Ramon, and L. De Raedt. 2008. Active Learning for High Through- put Screening. In International Conference on Discovery Science

work page 2008
[7]

Doncieux and J.-B

S. Doncieux and J.-B. Mouret. 2010. Behavioral Diversity Measures for Evolu- tionary Robotics. In IEEE Congress on Evolutionary Computation

work page 2010
[8]

Forrester, A

A. Forrester, A. Sobester, and A. Keane. 2008. Engineering Design via Surrogate Modelling. John Wiley & Sons

work page 2008
[9]

J. M. Gablonsky and C. T. Kelley. 2001. A Locally-Biased form of the DIRECT Algorithm. Journal of Global Optimization

work page 2001
[10]

Gaier, A

A. Gaier, A. Asteroth, and J.-B. Mouret. 2018. Data-efficient Neuroevolution with Kernel-Based Surrogate Models. In Proceedings of the Genetic and Evolutionary Computation Conference

work page 2018
[11]

Hildebrandt and J

T. Hildebrandt and J. Branke. 2015. On Using Surrogates with Genetic Program- ming. Evolutionary Computation

work page 2015
[12]

Y. Jin. 2011. Surrogate-assisted Evolutionary Computation: Recent Advances and Future Challenges. Swarm and Evolutionary Computation

work page 2011
[13]

Y. Jin, H. Wang, T. Chugh, D. Guo, and K. Miettinen. 2018. Data-driven Evo- lutionary Optimization: An Overview and Case Studies. IEEE Transactions on Evolutionary Computation

work page 2018
[14]

M. G. Kendall and J. D. Gibbons. 1990. Rank Correlation Methods. Oxford Univer- sity Press, London

work page 1990
[15]

J. R. Koza. 1994. Genetic programming. MIT Press

work page 1994
[16]

W. H. Kruskal and W. A. Wallis. 1952. Use of Ranks in One-Criterion Variance Analysis. J. Amer. Statist. Assoc

work page 1952
[17]

Moraglio, K

A. Moraglio, K. Krawiec, and C. G. Johnson. 2012. Geometric Semantic Genetic Programming. InInternational Conference on Parallel Problem Solving from Nature

work page 2012
[18]

J.-B. Mouret. 2011. Encouraging Behavioral Diversity in Evolutionary Robotics: An Empirical Study. Evolutionary Computation

work page 2011
[19]

Illuminating search spaces by mapping elites

J.-B. Mouret and J. Clune. 2015. Illuminating Search Spaces by Mapping Elites. arXiv:1504.04909v1

work page internal anchor Pith review Pith/arXiv arXiv 2015
[20]

Y. S. Ong, P. B. Nair, and A. J. Keane. 2003. Evolutionary Optimization of Compu- tationally Expensive Problems via Surrogate Modeling. AIAA Journal

work page 2003
[21]

T. Pohlert. 2018. PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended - R package, version 1.4.1

work page 2018
[22]

R Core Team. 2018. R: A Language and Environment for Statistical Computing

work page 2018
[23]

C. E. Rasmussen. 2004. Gaussian Processes in Machine Learning. In Advanced Lectures on Machine Learning . Springer

work page 2004
[24]

Snoek, H

J. Snoek, H. Larochelle, and R. P. Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems

work page 2012
[25]

K. O. Stanley. 2006. Exploiting Regularity Without Development. In Proceedings of the AAAI Fall Symposium on Developmental Systems . AAAI Press

work page 2006
[26]

K. O. Stanley and R. Miikkulainen. 2002. Evolving neural networks through augmenting topologies. Evolutionary Computation

work page 2002
[27]

Stork, M

J. Stork, M. Zaefferer, and T. Bartz-Beielstein. 2019. Improving NeuroEvolution Efficiency by Surrogate Model-Based Optimization with Phenotypic Distance Kernels. In Applications of Evolutionary Computation

work page 2019
[28]

Stork, M

J. Stork, M. Zaefferer, A. Fischbach, and T. Bartz-Beielstein. 2017. Surrogate- Assisted Learning of Neural Networks. InProceedings 27. Workshop Computational Intelligence

work page 2017
[29]

W. N. Venables and B. D. Ripley. 2002. Modern Applied Statistics with S. Springer

work page 2002
[30]

H. Wang, Y. Jin, and J. O. Jansen. 2016. Data-driven Surrogate-assisted Multi- objective Evolutionary Optimization of a Trauma System. IEEE Transactions on Evolutionary Computation

work page 2016
[31]

Zaefferer

M. Zaefferer. 2019. Combinatorial Efficient Global Optimization in R - CEGO v2.3.0. https://cran.r-project.org/package=CEGO accessed: 2019-03-19

work page 2019
[32]

Zaefferer, J

M. Zaefferer, J. Stork, O. Flasch, and T. Bartz-Beielstein. 2018. Linear Combination of Distance Measures for Surrogate Models in Genetic Programming. In Parallel Problem Solving from Nature – PPSN XV . Coimbra, Portugal

work page 2018

[1] [1]

C. C. Aggarwal, A. Hinneburg, and D. A. Keim. 2001. On the Surprising Behavior of Distance Metrics in High Dimensional Space. In Database Theory — ICDT 2001: 8th International Conference (LNCS) . London, UK

work page 2001

[2] [2]

W. J. Conover and R. L. Iman. 1979.On Multiple-comparisons Procedures. Technical Report LA-7677-MS. Los Alamos Sci. Lab

work page 1979

[3] [3]

S. J. Daniels, A. A. M. Rahat, R. M. Everson, G. R. Tabor, and J. E. Fieldsend

work page

[4] [4]

In International Conference on Parallel Problem Solving from Nature

A Suite of Computationally Expensive Shape Optimisation Problems Using Computational Fluid Dynamics. In International Conference on Parallel Problem Solving from Nature

work page

[5] [5]

R. Dawkins. 1982. The Extended Phenotype . Oxford University Press Oxford

work page 1982

[6] [6]

De Grave, J

K. De Grave, J. Ramon, and L. De Raedt. 2008. Active Learning for High Through- put Screening. In International Conference on Discovery Science

work page 2008

[7] [7]

Doncieux and J.-B

S. Doncieux and J.-B. Mouret. 2010. Behavioral Diversity Measures for Evolu- tionary Robotics. In IEEE Congress on Evolutionary Computation

work page 2010

[8] [8]

Forrester, A

A. Forrester, A. Sobester, and A. Keane. 2008. Engineering Design via Surrogate Modelling. John Wiley & Sons

work page 2008

[9] [9]

J. M. Gablonsky and C. T. Kelley. 2001. A Locally-Biased form of the DIRECT Algorithm. Journal of Global Optimization

work page 2001

[10] [10]

Gaier, A

A. Gaier, A. Asteroth, and J.-B. Mouret. 2018. Data-efficient Neuroevolution with Kernel-Based Surrogate Models. In Proceedings of the Genetic and Evolutionary Computation Conference

work page 2018

[11] [11]

Hildebrandt and J

T. Hildebrandt and J. Branke. 2015. On Using Surrogates with Genetic Program- ming. Evolutionary Computation

work page 2015

[12] [12]

Y. Jin. 2011. Surrogate-assisted Evolutionary Computation: Recent Advances and Future Challenges. Swarm and Evolutionary Computation

work page 2011

[13] [13]

Y. Jin, H. Wang, T. Chugh, D. Guo, and K. Miettinen. 2018. Data-driven Evo- lutionary Optimization: An Overview and Case Studies. IEEE Transactions on Evolutionary Computation

work page 2018

[14] [14]

M. G. Kendall and J. D. Gibbons. 1990. Rank Correlation Methods. Oxford Univer- sity Press, London

work page 1990

[15] [15]

J. R. Koza. 1994. Genetic programming. MIT Press

work page 1994

[16] [16]

W. H. Kruskal and W. A. Wallis. 1952. Use of Ranks in One-Criterion Variance Analysis. J. Amer. Statist. Assoc

work page 1952

[17] [17]

Moraglio, K

A. Moraglio, K. Krawiec, and C. G. Johnson. 2012. Geometric Semantic Genetic Programming. InInternational Conference on Parallel Problem Solving from Nature

work page 2012

[18] [18]

J.-B. Mouret. 2011. Encouraging Behavioral Diversity in Evolutionary Robotics: An Empirical Study. Evolutionary Computation

work page 2011

[19] [19]

Illuminating search spaces by mapping elites

J.-B. Mouret and J. Clune. 2015. Illuminating Search Spaces by Mapping Elites. arXiv:1504.04909v1

work page internal anchor Pith review Pith/arXiv arXiv 2015

[20] [20]

Y. S. Ong, P. B. Nair, and A. J. Keane. 2003. Evolutionary Optimization of Compu- tationally Expensive Problems via Surrogate Modeling. AIAA Journal

work page 2003

[21] [21]

T. Pohlert. 2018. PMCMRplus: Calculate Pairwise Multiple Comparisons of Mean Rank Sums Extended - R package, version 1.4.1

work page 2018

[22] [22]

R Core Team. 2018. R: A Language and Environment for Statistical Computing

work page 2018

[23] [23]

C. E. Rasmussen. 2004. Gaussian Processes in Machine Learning. In Advanced Lectures on Machine Learning . Springer

work page 2004

[24] [24]

Snoek, H

J. Snoek, H. Larochelle, and R. P. Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems

work page 2012

[25] [25]

K. O. Stanley. 2006. Exploiting Regularity Without Development. In Proceedings of the AAAI Fall Symposium on Developmental Systems . AAAI Press

work page 2006

[26] [26]

K. O. Stanley and R. Miikkulainen. 2002. Evolving neural networks through augmenting topologies. Evolutionary Computation

work page 2002

[27] [27]

Stork, M

J. Stork, M. Zaefferer, and T. Bartz-Beielstein. 2019. Improving NeuroEvolution Efficiency by Surrogate Model-Based Optimization with Phenotypic Distance Kernels. In Applications of Evolutionary Computation

work page 2019

[28] [28]

Stork, M

J. Stork, M. Zaefferer, A. Fischbach, and T. Bartz-Beielstein. 2017. Surrogate- Assisted Learning of Neural Networks. InProceedings 27. Workshop Computational Intelligence

work page 2017

[29] [29]

W. N. Venables and B. D. Ripley. 2002. Modern Applied Statistics with S. Springer

work page 2002

[30] [30]

H. Wang, Y. Jin, and J. O. Jansen. 2016. Data-driven Surrogate-assisted Multi- objective Evolutionary Optimization of a Trauma System. IEEE Transactions on Evolutionary Computation

work page 2016

[31] [31]

Zaefferer

M. Zaefferer. 2019. Combinatorial Efficient Global Optimization in R - CEGO v2.3.0. https://cran.r-project.org/package=CEGO accessed: 2019-03-19

work page 2019

[32] [32]

Zaefferer, J

M. Zaefferer, J. Stork, O. Flasch, and T. Bartz-Beielstein. 2018. Linear Combination of Distance Measures for Surrogate Models in Genetic Programming. In Parallel Problem Solving from Nature – PPSN XV . Coimbra, Portugal

work page 2018