pith. sign in

arxiv: 2009.10460 · v1 · submitted 2020-09-22 · 💻 cs.NE

Multi-threaded Memory Efficient Crossover in C++ for Generational Genetic Programming

Pith reviewed 2026-05-24 14:13 UTC · model grok-4.3

classification 💻 cs.NE
keywords genetic programmingcrossovermemory efficiencymulti-threadingC++generational evolutionary algorithms
0
0 comments X

The pith

C++ code for generational genetic programming crossover keeps memory use to M plus twice the thread count.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper supplies C++ code snippets that implement crossover for genetic programming in a multi-threaded way. The implementation ensures that no more than M plus twice the number of threads worth of individuals need to be held in memory at once. This design targets separate-generation evolutionary algorithms that face constraints from large chromosomes or limited RAM. A reader would care because it directly tackles the practical barrier of memory consumption when scaling up population-based search methods. The code is presented as adaptable rather than a complete standalone program.

Core claim

The paper claims that crossover can be coded in C++ for multi-core execution such that memory for only M plus 2 times nthreads simultaneously active individuals is required, and provides the relevant code snippets for use in generational genetic programming and similar separate-generation evolutionary algorithms.

What carries the argument

The memory-efficient multi-threaded crossover operator that reuses individual storage slots across threads without extra hidden allocations.

If this is right

  • Generational GP runs become feasible on machines whose RAM would otherwise be exhausted by holding two full populations.
  • Parallel crossover threads can operate without each thread needing its own full copy of the population.
  • The same memory bound applies when adapting the snippets to other separate-generation evolutionary algorithms.
  • Experiments with bigger chromosome sizes become possible without increasing total RAM proportionally.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same slot-reuse pattern might apply to mutation or other variation operators if they can be scheduled to avoid simultaneous copies.
  • On NUMA systems the thread-to-individual mapping could further reduce cache misses if individuals are placed near their processing thread.
  • The bound assumes a representation where individuals can be moved or overwritten in place; tree-based GP with heavy pointer structures would need explicit testing.

Load-bearing premise

The crossover code can be written so that only M plus 2 times nthreads individuals occupy memory at once, with no extra allocations from threading or data structures.

What would settle it

Compiling and running the supplied code on a workload with large chromosomes while tracking peak resident memory set size; if usage consistently exceeds M plus 2 times nthreads individuals the claim would not hold.

Figures

Figures reproduced from arXiv: 2009.10460 by W. B. Langdon.

Figure 1
Figure 1. Figure 1: In generational schemes (e.g. µ, µ) the current population (on the left) is completely replaced by the next population (on the right). The area of red dots is proportional to the number of children it will be a parent of in the next generation. Small white dots are infertile. We describe how to avoid simultaneously storing both populations. 1 Background It has been known for a long time that in conventiona… view at source ↗
Figure 2
Figure 2. Figure 2: In steady state Evolutionary Algorithms (e.g. [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Example generational GP run on 8 core i7-4790 3.60GHz CPU with 31.2GB of memory. The [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
read the original abstract

C++ code snippets from a multi-core parallel memory-efficient crossover for genetic programming are given. They may be adapted for separate generation evolutionary algorithms where large chromosomes or small RAM require no more than M + (2 times nthreads) simultaneously active individuals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript presents C++ code snippets for a multi-threaded memory-efficient crossover operator intended for generational genetic programming. It claims that the implementation can be adapted to ensure that no more than M + (2 * nthreads) individuals are simultaneously active in memory, where M denotes population size, for use in separate-generation evolutionary algorithms under constraints of large chromosomes or limited RAM.

Significance. The manuscript supplies reproducible C++ code snippets that practitioners could directly adapt for parallel generational GP. If the claimed memory bound holds without hidden allocations, this would provide a concrete engineering contribution for memory-constrained settings; however, the absence of any verification means the practical significance remains unestablished.

major comments (1)
  1. [Abstract] Abstract: The central claim that the supplied crossover implementation requires only M + (2 * nthreads) simultaneously active individuals is unsupported by any memory analysis, peak-usage measurements, allocator instrumentation, or discussion of potential overheads from the threading model, per-thread temporaries, crossover node handling, or the specific chromosome representation. This assumption is load-bearing for the paper's contribution.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed comment on the memory bound claim. The manuscript is a short code-focused note whose primary contribution is reproducible C++ snippets; we address the concern about substantiation below and will incorporate a clarifying explanation in revision.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that the supplied crossover implementation requires only M + (2 * nthreads) simultaneously active individuals is unsupported by any memory analysis, peak-usage measurements, allocator instrumentation, or discussion of potential overheads from the threading model, per-thread temporaries, crossover node handling, or the specific chromosome representation. This assumption is load-bearing for the paper's contribution.

    Authors: We agree that the manuscript contains no empirical memory measurements, allocator traces, or overhead discussion. The bound is asserted from the explicit structure of the supplied code rather than from runtime profiling: the generational population of M individuals is retained, while each thread allocates exactly two temporary individuals (one parent copy and one child) for crossover, with all other temporaries being stack locals or reused buffers that do not create additional active individuals. Threading overhead is confined to per-thread storage for these two individuals; no hidden heap allocations for chromosomes occur beyond the stated limit because the snippets manage vectors explicitly and avoid per-node temporaries outside the two-individual budget. Because the paper supplies the code itself, practitioners can inspect and confirm the accounting. We will add a short paragraph after the code listing that walks through this per-thread accounting and note the absence of measurements as a limitation of the note format. revision: yes

Circularity Check

0 steps flagged

No circularity; direct code description without derivations or predictions

full rationale

The paper consists of C++ code snippets for a multi-threaded crossover operator in generational genetic programming. It states that the implementation can be adapted to require no more than M + (2 * nthreads) simultaneously active individuals. There are no equations, fitted parameters, predictions derived from inputs, self-citations used as load-bearing premises, uniqueness theorems, or ansatzes. The contribution is a straightforward presentation of implementation details rather than any derivation chain that could reduce to its own inputs by construction. The memory claim is presented as a direct consequence of the code structure, not as a result obtained via fitting or self-referential citation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper supplies implementation code rather than a mathematical model; no free parameters, axioms, or invented entities are required or introduced.

pith-pipeline@v0.9.0 · 5550 in / 959 out tokens · 21974 ms · 2026-05-24T14:13:54.668794+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

  1. [1]

    Banzhaf, P

    W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. Genetic Programming – An Introduction; On the Automatic Evolution of Computer Programs and its Applications . Morgan Kaufmann. 1998

  2. [2]

    J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection . MIT Press, Cambridge, MA, USA, 1992

  3. [3]

    J. R. Koza, D. Andre, F. H. Bennett III, and M. Keane.Genetic Programming III: Darwinian Invention and Problem Solving . Morgan Kaufman, Apr. 1999

  4. [4]

    W. B. Langdon. Genetically improved software. In A. H. Gandomi, A. H. Alavi, and C. Ryan, editors, Handbook of Genetic Programming Applications, chapter 8, pages 181–220. Springer, 2015

  5. [5]

    W. B. Langdon. Long-term evolution of genetic programming populations. In GECCO 2017: The Genetic and Evolutionary Computation Conference , pages 235–236, Berlin, 15-19 July 2017. ACM

  6. [6]

    W. B. Langdon. Parallel GPQUICK. In C. Doerr, editor, GECCO ’19: Proceedings of the Genetic and Evolutionary Computation Conference Companion , pages 63–64, Prague, Czech Republic, July 13-17 2019. ACM

  7. [7]

    W. B. Langdon and W. Banzhaf. Repeated patterns in genetic programming. Natural Computing, 7(4):589–613, Dec. 2008

  8. [8]

    W. B. Langdon and W. Banzhaf. Continuous long-term evolution of genetic programming. In R. Fuech- slin, editor, Conference on Artificial Life (ALIFE 2019) , pages 388–395, Newcastle, 29 July - 2 Aug

  9. [9]

    W. B. Langdon and A. P. Harrison. GP on SPMD parallel graphics hardware for mega bioinformatics data mining. Soft Computing, 12(12):1169–1183, Oct. 2008. Special Issue on Distributed Bioinspired Algorithms

  10. [10]

    W. B. Langdon and R. Poli. Fitness causes bloat. In P. K. Chawdhry, R. Roy, and R. K. Pant, editors, Soft Computing in Engineering Design and Manufacturing , pages 13–22. Springer-Verlag, 1997

  11. [11]

    J. F. Miller, editor. Cartesian Genetic Programming. Natural Computing Series. Springer, 2011

  12. [12]

    P. Nordin. Evolutionary Program Induction of Binary Machine Code and its Applications . PhD thesis, der Universitat Dortmund am Fachereich Informatik, Germany, 1997

  13. [13]

    O’Neill and C

    M. O’Neill and C. Ryan. Grammatical Evolution: Evolutionary Automatic Programming in a Arbitrary Language, volume 4 of Genetic programming. Kluwer Academic Publishers, 2003

  14. [14]

    R. Poli, W. B. Langdon, and N. F. McPhee. A field guide to genetic programming . Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contri- butions by J. R. Koza)

  15. [15]

    Singleton

    A. Singleton. Genetic programming with C++. BYTE, pages 171–176, Feb. 1994

  16. [16]

    ERROR ran out of expr buffers

    G. Syswerda. A study of reproduction in generational and steady state genetic algorithms. In G. J. E. Rawlings, editor, Foundations of genetic algorithms , pages 94–101. Morgan Kaufmann, Indiana Uni- versity, 15-18 July 1990. Published 1991. 6 A GPquick C++ Code Snippets. chrome.cxx Revision: 1.262 Standard C++ conditional compilation #ifndef NDEBUG and a...