pith. sign in

arxiv: 2605.21055 · v1 · pith:25VF34Q4new · submitted 2026-05-20 · 💻 cs.NE · cs.LG

Genetic Programming with Transformer-Based Mutation for Approximate Circuit Design

Pith reviewed 2026-05-21 01:33 UTC · model grok-4.3

classification 💻 cs.NE cs.LG
keywords approximate circuitsCartesian genetic programmingtransformer mutationevolutionary designhybrid searcharithmetic circuit approximation
0
0 comments X

The pith

Transformer-based mutation in genetic programming produces approximate multipliers with better error-efficiency trade-offs than prior optimized designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a transformer model trained to suggest mutations for Cartesian genetic programming applied to approximate arithmetic circuits. It combines this learned operator with the standard random mutation in a switching scheme that keeps the evolutionary search from stalling. Training draws on thousands of chromosome encodings drawn from earlier circuit evolution runs. For multiple target error levels the resulting multiplier designs improve the balance of approximation accuracy against hardware cost compared with existing highly optimized examples. The computational cost of training and search is presented as the price for obtaining new circuit structures.

Core claim

A transformer trained on collections of CGP chromosomes representing approximate multipliers can function as a mutation operator; when alternated with conventional mutation in a hybrid CGP run, the process yields approximate multipliers that satisfy given error constraints with more favorable implementation costs than previously available designs.

What carries the argument

The hybrid mutation scheme that alternates a transformer predictor of useful CGP chromosome changes with standard point mutation, operating on graph-based encodings of circuit topologies.

If this is right

  • Approximate multipliers meeting fixed error limits can be found with lower power or area than earlier manual or evolutionary results.
  • The alternating mutation schedule keeps the search progressing across many generations.
  • Training data built from thousands of prior CGP runs supplies the transformer with patterns that produce productive structural edits.
  • The same pipeline can be repeated for other arithmetic blocks once suitable training chromosomes are collected.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same learned-mutation idea could be tested on the design of approximate adders or other digital blocks.
  • If the transformer captures reusable patterns of useful circuit edits, it may lower the amount of domain knowledge needed to set up evolutionary runs.
  • Extending the chromosome length and training set size would show whether the method remains effective for larger circuits.
  • Combining the transformer operator with other forms of machine-learning guidance might produce further gains in search efficiency.

Load-bearing premise

The hybrid switching rule together with training on chosen sets of CGP chromosomes will generate mutations that improve circuit topologies without overfitting to the training examples or biasing results through the selected error measures.

What would settle it

A side-by-side measurement of power, area, and delay for circuits produced under identical error thresholds; if the new designs show no consistent advantage, the claimed improvement does not hold.

Figures

Figures reproduced from arXiv: 2605.21055 by Lukas Sekanina, Ondrej Galeta.

Figure 1
Figure 1. Figure 1: Example graph and chromosome representation of [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Proposed transformer-based mutation operator [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Transformer training and incorporating the trained [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 3
Figure 3. Figure 3: The principle of multiplier selection from a dataset of [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: Training progress of the proposed transformer model [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 7
Figure 7. Figure 7: Progress of CGP TR for different values of the maximal [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗
Figure 6
Figure 6. Figure 6: Fitness progress in CGP utilizing standard mutation [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: WCE and area of the best approximate 8-bit multipliers [PITH_FULL_IMAGE:figures/full_fig_p008_8.png] view at source ↗
read the original abstract

A recent trend is to leverage machine learning models to improve the evolutionary design and optimization process. We propose a novel transformer-based mutation operator for Cartesian genetic programming (CGP) for the automated design of approximate arithmetic circuits. We introduce a hybrid scheme for CGP in which the proposed mutation operator is switched with the standard mutation operator to prevent stagnation of the circuit approximation process. We also develop a new training scheme for the underlying transformer that utilizes training vectors composed of thousands of CGP chromosomes representing various approximate multipliers. For several target error constraints, the approximate multipliers evolved with CGP utilizing the transformer-based mutation achieve better trade-offs than the highly optimized designs available in the state-of-the-art EvoApproxLib library of approximate circuits. Although both training and evolutionary processes are computationally demanding, they appear to be necessary steps for improving existing approximate circuits and producing new, potentially patentable circuit designs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a transformer-based mutation operator for Cartesian Genetic Programming (CGP) applied to approximate multiplier design. A hybrid switching scheme alternates the transformer mutation with standard CGP mutation to avoid stagnation. The transformer is trained on vectors consisting of thousands of prior CGP chromosomes representing approximate multipliers. The central empirical claim is that, for several target error constraints, the resulting evolved circuits achieve superior error-area trade-offs relative to the highly optimized designs in the EvoApproxLib library.

Significance. If the performance gains hold under rigorous controls, the work would demonstrate a practical way to inject learned mutation distributions into CGP, addressing a known stagnation issue while generating new circuit topologies. The explicit acknowledgment that both training and evolution are computationally intensive yet necessary for improvement beyond existing libraries is a strength; the approach also opens a route to potentially patentable designs in approximate computing.

major comments (2)
  1. [§4] §4 (Experimental Results): the central claim of superior trade-offs versus EvoApproxLib is load-bearing yet reported without quantitative deltas, error bars, or statistical significance tests; without these the magnitude and reliability of the improvement cannot be assessed.
  2. [§3.2] §3.2 (Hybrid Scheme and Training): the hybrid switching frequency and the choice of training vectors drawn exclusively from prior CGP runs are free parameters whose selection procedure is not detailed; this leaves open the possibility that observed gains reflect bias in the learned distribution rather than genuine generalization to new topologies.
minor comments (2)
  1. [Abstract] Abstract: the phrase 'better trade-offs' should be accompanied by the precise metrics (e.g., area, power, mean squared error) used for comparison.
  2. [Figures] Figure captions: legends distinguishing the proposed hybrid CGP from pure CGP and from EvoApproxLib baselines are occasionally unclear.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment below in a point-by-point manner and indicate the specific revisions we will incorporate to improve the rigor and clarity of the work.

read point-by-point responses
  1. Referee: [§4] §4 (Experimental Results): the central claim of superior trade-offs versus EvoApproxLib is load-bearing yet reported without quantitative deltas, error bars, or statistical significance tests; without these the magnitude and reliability of the improvement cannot be assessed.

    Authors: We agree that the presentation of results would benefit from explicit quantitative support. The manuscript currently demonstrates superior trade-offs through Pareto-optimal fronts and tabulated circuit metrics for selected error constraints, but does not report numerical deltas, error bars from repeated runs, or formal statistical tests. We will revise Section 4 to add a summary table listing percentage improvements in area (or power) relative to the closest EvoApproxLib designs, together with standard deviations obtained from 10 independent evolutionary runs and p-values from paired statistical tests. These additions will allow direct assessment of both effect size and reliability. revision: yes

  2. Referee: [§3.2] §3.2 (Hybrid Scheme and Training): the hybrid switching frequency and the choice of training vectors drawn exclusively from prior CGP runs are free parameters whose selection procedure is not detailed; this leaves open the possibility that observed gains reflect bias in the learned distribution rather than genuine generalization to new topologies.

    Authors: The switching interval in the hybrid scheme was selected after preliminary experiments showed that frequent alternation prevented the stagnation commonly observed with pure CGP mutation. Training vectors were assembled from chromosomes generated across multiple prior CGP runs that spanned a wide range of error targets in order to encourage the transformer to learn a general mutation distribution. Nevertheless, the manuscript does not provide a full account of the tuning process or explicit criteria used to curate the training set. We will expand Section 3.2 with a description of the preliminary tuning experiments, the diversity metrics applied to the training chromosomes, and new results showing the transformer’s performance when applied to completely independent CGP populations not seen during training. This will help demonstrate that the gains are not solely due to distribution bias. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results rest on external library comparison

full rationale

The paper trains a transformer on CGP chromosomes to serve as a mutation operator within a hybrid CGP scheme and then reports evolved circuit quality against the independent EvoApproxLib library for given error constraints. No equation, parameter fit, or self-citation is shown to define the target performance metric or to force the reported trade-off improvements by construction. The central claim therefore remains an empirical observation rather than a tautological restatement of the training inputs.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The approach depends on several empirical choices whose justification is not visible from the abstract: the size and composition of the training chromosome set, the frequency of switching between mutation operators, and the precise error metrics used to label training data. No new physical entities are postulated.

free parameters (2)
  • transformer training vector count
    Thousands of CGP chromosomes are used; exact count and selection criteria are chosen to train the model and directly affect mutation quality.
  • hybrid switch frequency
    The rate at which transformer mutation is alternated with standard mutation is a design parameter that prevents stagnation but must be tuned.
axioms (1)
  • domain assumption The transformer can learn generalizable mutation patterns from CGP chromosomes that improve evolutionary search beyond random mutation.
    Invoked when claiming the hybrid scheme produces better trade-offs; no proof or external validation is supplied in the abstract.

pith-pipeline@v0.9.0 · 5673 in / 1361 out tokens · 30522 ms · 2026-05-21T01:33:35.571742+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

  1. [1]

    A survey of techniques for approximate computing,

    S. Mittal, “A survey of techniques for approximate computing,”ACM Comput. Surv., vol. 48, no. 4, pp. 62:1–62:33, 2016

  2. [2]

    Hardware approximate techniques for deep neural network accelerators: A survey,

    G. Armeniakos, G. Zervakis, D. Soudris, and J. Henkel, “Hardware approximate techniques for deep neural network accelerators: A survey,” ACM Comput. Surv., vol. 55, no. 4, pp. 83:1–83:36, 2023

  3. [3]

    Approximate arithmetic circuits: A survey, characterization, and recent applications,

    H. Jiang, F. J. H. Santiago, H. Mo, L. Liu, and J. Han, “Approximate arithmetic circuits: A survey, characterization, and recent applications,” Proc. IEEE, vol. 108, no. 12, pp. 2108–2135, 2020

  4. [4]

    J. F. Miller,Cartesian Genetic Programming, 1st ed. Berlin, Germany: Springer-Verlag, 2011

  5. [5]

    Evoapprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approxi- mation methods,

    V . Mrazek, R. Hrbaceket al., “Evoapprox8b: Library of approximate adders and multipliers for circuit design and benchmarking of approxi- mation methods,” inProc. of DATE’17, 2017, pp. 258–261

  6. [6]

    Libraries of approximate circuits: Automated design and application in cnn accelerators,

    V . Mrazek, L. Sekanina, and Z. Vasicek, “Libraries of approximate circuits: Automated design and application in cnn accelerators,”IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 10, no. 4, pp. 406–418, 2020

  7. [7]

    Approximating complex arithmetic circuits with formal error guaran- tees: 32-bit multipliers accomplished,

    M. Ceska, J. Matyas, V . Mrazek, L. Sekanina, Z. Vasicek, and T. V ojnar, “Approximating complex arithmetic circuits with formal error guaran- tees: 32-bit multipliers accomplished,” inProc. of 36th IEEE/ACM Int. Conf. On Computer Aided Design. IEEE, 2017, pp. 416–423

  8. [8]

    Sagtree: Towards efficient mutation in evolutionary circuit approxima- tion,

    M. Ceska, J. Matys, V . Mrazek, L. Sekanina, Z. Vasicek, and T. V ojnar, “Sagtree: Towards efficient mutation in evolutionary circuit approxima- tion,”Swarm Evol. Comput., vol. 69, p. 100986, 2022

  9. [9]

    Evolutionary computation in the era of large language model: Survey and roadmap,

    X. Wu, S.-H. Wu, J. Wu, L. Feng, and K. C. Tan, “Evolutionary computation in the era of large language model: Survey and roadmap,” IEEE Transactions on Evolutionary Computation, vol. 29, no. 2, pp. 534–554, 2025

  10. [10]

    Hemberg, S

    E. Hemberg, S. Jorgensen, and U.-M. O’Reilly,Survey of Genetic Programming and Large Language Models. Springer Nature Singapore, 2025, pp. 67–86

  11. [11]

    A comparison of large language models and genetic programming for program synthesis,

    D. Sobania, J. Petke, M. Briesch, and F. Rothlauf, “A comparison of large language models and genetic programming for program synthesis,” IEEE Trans. Evol. Comput., vol. 29, no. 4, pp. 1434–1448, 2025

  12. [12]

    Exploiting errors for efficiency: A survey from circuits to applications,

    P. Stanley-Marbell, A. Alaghi, M. Carbin, E. Darulova, L. Dolecek, A. Gerstlauer, G. Gillani, D. Jevdjic, T. Moreau, M. Cacciotti, A. Daglis, N. E. Jerger, B. Falsafi, S. Misailovic, A. Sampson, and D. Zufferey, “Exploiting errors for efficiency: A survey from circuits to applications,” ACM Comput. Surv., vol. 53, no. 3, 2020

  13. [13]

    Design of approximate logarithmic multipliers,

    W. Liu, J. Xu, D. Wang, and F. Lombardi, “Design of approximate logarithmic multipliers,” inProceedings of the Great Lakes Symposium on VLSI 2017, ser. GLSVLSI ’17. New York, NY , USA: Association for Computing Machinery, 2017, p. 47–52. [Online]. Available: https://doi.org/10.1145/3060403.3060409

  14. [14]

    Bio-inspired imprecise computational blocks for efficient vlsi implementation of soft- computing applications,

    H. R. Mahdiani, A. Ahmadi, S. M. Fakhraie, and C. Lucas, “Bio-inspired imprecise computational blocks for efficient vlsi implementation of soft- computing applications,”IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 4, pp. 850–862, April 2010

  15. [15]

    Jump search: A fast technique for the synthesis of approximate circuits,

    L. Witschen, H. G. Mohammadi, M. Artmann, and M. Platzner, “Jump search: A fast technique for the synthesis of approximate circuits,” in Proceedings of the 2019 on Great Lakes Symposium on VLSI, GLSVLSI. ACM, 2019, pp. 153–158

  16. [16]

    Gptac: Domain- specific generative pre-trained model for approximate circuit design exploration,

    S. Yi, W. Zuo, H. Wu, R. Dai, W. Qian, and J. Chen, “Gptac: Domain- specific generative pre-trained model for approximate circuit design exploration,”IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 15, no. 2, pp. 349–360, 2025

  17. [17]

    Attention is all you need,

    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems, 2017, pp. 5998–6008

  18. [18]

    BERT: Pre-training of deep bidirectional transformers for language understanding,

    J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” inProc. of the 2019 Conf. of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, vol. 1. Association for Computational Linguistics, Jun. 2019, pp. 4171–4186

  19. [19]

    VeriGen: A large language model for verilog code generation,

    S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, “Verigen: A large language model for verilog code generation,”ACM Trans. Design Autom. Electr. Syst., vol. 29, no. 3, pp. 46:1–46:31, 2024. [Online]. Available: https://doi.org/10.1145/3643681

  20. [20]

    Fundamentals of evolutionary machine learning,

    W. Banzhaf and P. Machado, “Fundamentals of evolutionary machine learning,” inHandbook of Evolutionary Machine Learning. Singapore: Springer Nature Singapore, 2024, pp. 3–28

  21. [21]

    Bert mutation: Deep transformer model for masked uniform mutation in genetic programming,

    E. Shem-Tov, M. Sipper, and A. Elyasaf, “Bert mutation: Deep transformer model for masked uniform mutation in genetic programming,”Mathematics, vol. 13, no. 5, 2025. [Online]. Available: https://www.mdpi.com/2227-7390/13/5/779

  22. [22]

    Symbolic regression trees as embedded representations,

    V . Caetano, M. C. Teixeira, and G. L. Pappa, “Symbolic regression trees as embedded representations,” inProceedings of the Genetic and Evo- lutionary Computation Conference, GECCO, S. Silva and L. Paquete, Eds. ACM, 2023, pp. 411–419

  23. [23]

    Transformers as surrogate models for genetic programming in automl tasks,

    M. C. Teixeira and G. L. Pappa, “Transformers as surrogate models for genetic programming in automl tasks,” inProceedings of the Genetic and Evolutionary Computation Conference, ser. GECCO ’25. New York, NY , USA: ACM, 2025, p. 472–480

  24. [24]

    Efficient phenotype evaluation in cartesian genetic programming,

    Z. Va ˇs´ıˇcek and K. Slan ´y, “Efficient phenotype evaluation in cartesian genetic programming,” inProc. of the 15th European Conf. on Genetic Programming, ser. LNCS, vol. 7244. Springer, 2012, pp. 266–278

  25. [25]

    Towards highly optimized cartesian genetic programming: From sequential via simd and thread to massive parallel implementation,

    R. Hrb ´aˇcek and L. Sekanina, “Towards highly optimized cartesian genetic programming: From sequential via simd and thread to massive parallel implementation,” inProc. of Genetic and Evolutionary Compu- tation. ACM, 2014, pp. 1015–1022

  26. [26]

    Regularizing Neural Networks by Penalizing Confident Output Distributions

    G. Pereyra, G. Tucker, J. Chorowski, Łukasz Kaiser, and G. Hinton, “Regularizing neural networks by penalizing confident output distributions,” 2017, arXiv preprint. [Online]. Available: https://arxiv.org/abs/1701.06548