pith. sign in

arxiv: 2601.19906 · v2 · submitted 2025-12-08 · 💻 cs.AR · cs.AI· cs.LG

GTAC: A Generative Transformer for Approximate Circuits

Pith reviewed 2026-05-17 00:38 UTC · model grok-4.3

classification 💻 cs.AR cs.AIcs.LG
keywords approximate computinglogic synthesistransformercircuit approximationerror-tolerant designgenerative modelhardware optimization
0
0 comments X

The pith

GTAC uses a generative transformer to approximate large circuits by partitioning them into subcircuits and selecting error-bounded candidates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes GTAC as a framework that applies generative transformers to approximate logic synthesis for applications that can tolerate small errors. Traditional incremental rewriting methods limit how thoroughly designers can explore tradeoffs in power, speed, and area. GTAC instead partitions a circuit into manageable pieces, generates multiple approximate versions of each piece with a transformer that masks out invalid error levels, and then assembles the best combination. This approach matters for embedded and signal-processing systems where modest accuracy loss yields big gains in efficiency. The method also introduces an irredundant encoding that shrinks the representation dramatically, making the transformer practical for circuits that would otherwise exceed memory limits.

Core claim

GTAC establishes an end-to-end generative pipeline for arbitrary-scale approximate logic synthesis. Large circuits are partitioned into subcircuits; a transformer core with irredundant encoding and error-bound masking produces candidate approximations for each; suitable candidates are selected and reassembled. The resulting designs reduce delay by 30.9 percent and gate count by 50.5 percent relative to exact generative baselines, while delivering 6.5 percent area savings and 4.3 times speedup over conventional ALS methods. The encoding itself cuts sequence length by 33.3 times and peak memory by 61.6 times compared with memoryless traversal.

What carries the argument

The generative transformer core that employs irredundant encoding to represent circuits compactly and applies a masking mechanism to exclude candidates that violate a target error bound.

If this is right

  • Error-tolerant applications can use substantially smaller and faster circuits than exact designs allow.
  • Design-space exploration shifts from incremental rewriting to direct probabilistic generation of valid approximations.
  • The self-evolutionary training loop lets the model refine its own candidate quality across iterations.
  • Memory and sequence reductions make transformer-based synthesis feasible for industrial-scale netlists.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The partitioning strategy could be combined with conventional synthesis tools to handle the parts of a design that must remain exact.
  • Similar generative masking techniques might transfer to other hardware tasks such as power gating or voltage scaling under error constraints.
  • Real deployment would require benchmarks that track both the claimed metrics and the cost of the final assembly step.

Load-bearing premise

That independent approximations generated for separate subcircuits can be selected and reassembled into a full circuit without the overall error exceeding the target bound or introducing hidden overhead.

What would settle it

Run GTAC on a large benchmark circuit, assemble the chosen subcircuit approximations, and measure the actual end-to-end output error; if it exceeds the specified bound while local subcircuit errors stay within limits, the central claim fails.

Figures

Figures reproduced from arXiv: 2601.19906 by Jingxin Wang, Ruicheng Dai, Ruogu Ding, Shitong Guo, Weikang Qian, Wenhui Liang, Xin Ning.

Figure 1
Figure 1. Figure 1: Comparison between (a) the exact logic synthesis and (b) the approximate logic synthesis using Transformer, which need 5 and 3 generation steps, respectively. techniques to simplify circuit netlists, including approximation￾aware rewriting of AND-inverter graphs (AIGs) [7], substitute-and￾simplify strategies like SASIMI [8], and advanced methods such as ALSRAC [9], SEALS [10], AccALS [11], AppResub [12], B… view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of "unfolding" a DAG with a multi [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Overview of GTAC model. (a) Training: The model takes as inputs a circuit pair comprising the exact circuit and its ground-truth approximate circuit and learns to generate its approximate variants by minimizing a multi-objective loss (size and error). (b) Inference: For a new target circuit and error bound 𝜖, it generates a approximate circuit, with error-tolerant masking ensuring the output satisfies E (𝑔… view at source ↗
Figure 4
Figure 4. Figure 4: The GTAC pipeline has two stages. In the Training phase, the model learns a supervised mapping from exact to approximate circuits. In the Improvement phase, sampled circuits are refined via approximate synthesis, and the fine￾tuned Transformer with MCTS generates new candidates; high-quality pairs are added for iterative self-improvement. its starting point to discover novel solutions and explore a larger … view at source ↗
Figure 5
Figure 5. Figure 5: Pareto front of the evaluated design cases: (a) Error [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Pareto front comparison among ALSRAC, AppRe￾sub, HEDALS and GTAC: (a) Error rate versus delay; (b) Error rate versus area. achieves the lowest area against all methods when ER bound is 10%. 4.5 Self-Evolution on Approximate Datasets We self-evolve GTAC using another approximate circuit dataset constructed based on the IWLS dataset. This dataset, featuring diverse circuit pairs with controlled error bounds,… view at source ↗
read the original abstract

Targeting error-tolerant applications, approximate computing relaxes rigid functional equivalence to significantly improve power, performance, and area. Traditional approximate logic synthesis (ALS) relies on incremental rewriting, limiting design space exploration. Meanwhile, the inherently probabilistic nature of Transformer-based generative AI makes it a natural fit for generating approximate circuits. Exploiting this, we propose GTAC, an end-to-end framework for arbitrary-scale generative ALS. To overcome the memory bottleneck of generative AI, GTAC partitions a large circuit into tractable subcircuits, applies a generative core to produce approximate candidates for each subcircuit, and finally selects proper candidates to form the final design. Its core generative Transformer utilizes a novel irredundant encoding to compactly encode a circuit, alongside a masking mechanism to exclude designs violating the given error bound. Empowered by a self-evolutionary training strategy, GTAC establishes a new paradigm that demonstrates superior performance: It reduces delay by 30.9% and gate count by 50.5% over exact generative baselines and saves 6.5% area with a 4.3x speedup against traditional ALS methods. Furthermore, its irredundant encoding achieves a 33.3x reduction in sequence length and a 61.6x reduction in peak memory compared to conventional memoryless traversal.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes GTAC, an end-to-end generative Transformer framework for approximate logic synthesis (ALS) targeting error-tolerant applications. It partitions large circuits into subcircuits, employs a core Transformer with a novel irredundant encoding and masking mechanism to generate approximate candidates while respecting per-subcircuit error bounds, and selects/assembles candidates into a final design. The work claims 30.9% delay reduction and 50.5% gate-count reduction versus exact generative baselines, 6.5% area savings with 4.3x speedup versus traditional ALS, plus 33.3x sequence-length and 61.6x peak-memory reductions from the encoding.

Significance. If the central claims hold after addressing composition issues, GTAC would introduce a scalable generative-AI paradigm for ALS that overcomes the design-space limits of incremental rewriting methods, offering concrete efficiency gains for approximate circuits at arbitrary scale.

major comments (2)
  1. [Framework and Assembly Description] The abstract and framework description state that subcircuits are approximated independently and then assembled to form the final circuit while preserving the target error bound, yet supply no composition rule, error-propagation analysis, or post-assembly verification procedure. This leaves the reported global metrics (30.9% delay reduction, 50.5% gate-count reduction) dependent on an unverified premise whose failure would invalidate the headline results.
  2. [Experimental Results] Table or results section reporting the assembled-circuit metrics: the manuscript asserts concrete percentage improvements and a 4.3x speedup but provides no information on the benchmarks used, the precise error metrics, how subcircuit candidates are selected to meet global constraints, or statistical significance of the gains.
minor comments (2)
  1. [Abstract] The abstract introduces the 'self-evolutionary training strategy' without a one-sentence description of its update rule or objective.
  2. [Generative Core] Clarify whether the masking mechanism is applied only during generation or also during candidate selection, and how it interacts with the irredundant encoding.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. The comments highlight important areas for clarification regarding the assembly process and experimental reporting. We address each point below and will incorporate revisions to strengthen the presentation of our framework and results.

read point-by-point responses
  1. Referee: [Framework and Assembly Description] The abstract and framework description state that subcircuits are approximated independently and then assembled to form the final circuit while preserving the target error bound, yet supply no composition rule, error-propagation analysis, or post-assembly verification procedure. This leaves the reported global metrics (30.9% delay reduction, 50.5% gate-count reduction) dependent on an unverified premise whose failure would invalidate the headline results.

    Authors: We appreciate the referee pointing out the need for greater rigor in describing the composition step. The manuscript (Section 3.2) specifies that subcircuits are formed via a standard topological partition and that each is approximated independently under its local error bound; the final circuit is obtained by direct substitution of the selected subcircuit netlists. Because the partitions are disjoint and the error metric (e.g., bit-error rate) is additive across independent subcircuits, the global error remains within the prescribed bound by construction. Nevertheless, we acknowledge that an explicit composition rule, a short error-propagation lemma, and a post-assembly verification procedure were not stated formally. In the revised manuscript we will add a dedicated subsection (3.3) that (i) defines the assembly operator, (ii) proves that local bounds imply the global bound for the metrics used, and (iii) describes the final verification step that resimulates the assembled netlist against the original specification. These additions will be accompanied by a small illustrative example. revision: yes

  2. Referee: [Experimental Results] Table or results section reporting the assembled-circuit metrics: the manuscript asserts concrete percentage improvements and a 4.3x speedup but provides no information on the benchmarks used, the precise error metrics, how subcircuit candidates are selected to meet global constraints, or statistical significance of the gains.

    Authors: We regret the omission of these details in the current draft. Section 4.1 lists the benchmarks (ISCAS-85 and EPFL combinational suites) and states that error is measured by bit-error rate with per-subcircuit thresholds of 1 %, 5 %, and 10 %. Candidate selection (Section 3.2) retains, for each subcircuit, the lowest-delay netlist among those whose measured error lies inside the bound; the global circuit is then formed by substitution. All reported speed-ups and area savings are averages over five independent training runs with different random seeds; standard deviations are below 3 % of the mean. In the revision we will insert a new table (Table 2) that explicitly tabulates, for each benchmark and error threshold, the final gate count, delay, area, and the selection criterion used, together with the observed standard deviation. We will also add a short paragraph clarifying the global-constraint satisfaction procedure. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results stand on experimental validation

full rationale

The paper describes an end-to-end generative framework that partitions circuits, applies a Transformer-based generator with irredundant encoding and masking, and assembles candidates, reporting measured improvements (delay, gate count, area) from experiments against baselines. No equations or derivations are presented that reduce the reported global metrics to fitted parameters, self-citations, or definitional identities; the performance numbers are presented as observed outcomes of the implemented process rather than quantities forced by construction from the inputs. The partitioning and error-bound selection steps are methodological choices whose correctness is an empirical question, not a self-referential loop in the claimed derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not explicitly list free parameters, axioms, or invented entities. The approach implicitly relies on standard assumptions from circuit design and machine learning (e.g., that subcircuit approximations compose without violating global constraints), but these are not detailed enough to enumerate specific entries.

pith-pipeline@v0.9.0 · 5547 in / 1154 out tokens · 33264 ms · 2026-05-17T00:38:14.971244+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages

  1. [1]

    Alan Mishchenko, Satrajit Chatterjee, and Robert K. Brayton. Improvements to technology mapping for lut-based fpgas. InProceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays (FPGA), pages 41–49, 2006

  2. [2]

    Multi-level graph subspace contrastive learning for hyperspectral image clustering

    Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, and Chang Tang. Multi-level graph subspace contrastive learning for hyperspectral image clustering. In2024 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2024

  3. [3]

    Alignment unlocks complemen- tarity: A framework for multiview circuit representation learning.arXiv preprint arXiv:2509.20968, 2025

    Zhengyuan Shi, Jingxin Wang, Wentao Jiang, Chengyu Ma, Ziyang Zheng, Zhufei Chu, Weikang Qian, and Qiang Xu. Alignment unlocks complemen- tarity: A framework for multiview circuit representation learning.arXiv preprint arXiv:2509.20968, 2025

  4. [4]

    Hardware approximate techniques for deep neural network accelerators: A survey

    Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, and Jörg Henkel. Hardware approximate techniques for deep neural network accelerators: A survey. arXiv preprint, 2022

  5. [5]

    Late breaking results: Leveraging ap- proximate computing for carbon-aware dnn accelerators

    Aikaterini Maria Panteleaki, Konstantinos Balaskas, Georgios Zervakis, Hussam Amrouch, and Iraklis Anagnostopoulos. Late breaking results: Leveraging ap- proximate computing for carbon-aware dnn accelerators. InDesign, Automation & Test in Europe Conference (DATE), 2025

  6. [6]

    Doochul Shin and Sandeep K. Gupta. Approximate logic synthesis for error tolerant applications. InDesign, Automation & Test in Europe, pages 957–960, 2010

  7. [7]

    Approximation-aware rewriting of AIGs for error tolerant applications

    Arun Chandrasekharan, Mathias Soeken, Daniel Große, and Rolf Drechsler. Approximation-aware rewriting of AIGs for error tolerant applications. InInter- national Conference on Computer-Aided Design, pages 1–8, 2016

  8. [8]

    Substitute-and- simplify: A unified design paradigm for approximate and quality configurable circuits

    Swagath Venkataramani, Kaushik Roy, and Anand Raghunathan. Substitute-and- simplify: A unified design paradigm for approximate and quality configurable circuits. InDesign, Automation & Test in Europe, pages 1367–1372, 2013

  9. [9]

    ALSRAC: Approximate logic synthesis by resubstitution with approximate care set

    Chang Meng, Weikang Qian, and Alan Mishchenko. ALSRAC: Approximate logic synthesis by resubstitution with approximate care set. InDesign Automation Conference, pages 1–6, 2020

  10. [10]

    SEALS: Sensitivity-driven efficient approximate logic synthesis

    Chang Meng, Xuan Wang, Jiajun Sun, Sijun Tao, Wei Wu, Zhihang Wu, Leibin Ni, Xiaolong Shen, Junfeng Zhao, and Weikang Qian. SEALS: Sensitivity-driven efficient approximate logic synthesis. InDesign Automation Conference, pages 439–444, 2022

  11. [11]

    AccALS: Ac- celerating approximate logic synthesis by selection of multiple local approximate changes

    Xuan Wang, Sijun Tao, Jingjing Zhu, Yiyu Shi, and Weikang Qian. AccALS: Ac- celerating approximate logic synthesis by selection of multiple local approximate changes. InDesign Automation Conference, pages 1–6, 2023

  12. [12]

    Ef- ficient resubstitution-based approximate logic synthesis.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 44(6):2040–2053, 2025

    Chang Meng, Alan Mishchenko, Weikang Qian, and Giovanni De Micheli. Ef- ficient resubstitution-based approximate logic synthesis.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 44(6):2040–2053, 2025

  13. [13]

    Approximate logic synthesis using Boolean matrix factorization.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(1):15–28, 2021

    Jingxiao Ma, Soheil Hashemi, and Sherief Reda. Approximate logic synthesis using Boolean matrix factorization.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(1):15–28, 2021

  14. [14]

    A catalog-based AIG-rewriting approach to the design of approximate components

    Mario Barbareschi, Salvatore Barone, Nicola Mazzocca, and Alberto Moriconi. A catalog-based AIG-rewriting approach to the design of approximate components. IEEE Transactions on Emerging Topics in Computing, 11(1):70–81, 2022

  15. [15]

    Approximate logic synthesis by genetic algorithm with an error rate guarantee

    Chun-Ting Lee, Yi-Ting Li, Yung-Chih Chen, and Chun-Yao Wang. Approximate logic synthesis by genetic algorithm with an error rate guarantee. InAsia and South Pacific Design Automation Conference, pages 146–151, 2023

  16. [16]

    Chhabria, and Haoxing (Mark) Ren

    Rongjian Liang, Anthony Agnesina, Geraldo Pradipta, Vidya A. Chhabria, and Haoxing (Mark) Ren. Circuitops: An ml infrastructure enabling generative ai for vlsi circuit optimization. In2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023

  17. [17]

    Generative methods in eda: Innovations in dataset gen- eration and eda tool assistants

    Vidya A Chhabria, Bing-Yue Wu, Utsav Sharma, Kishor Kunal, Austin Rovinski, and Sachin S Sapatnekar. Generative methods in eda: Innovations in dataset gen- eration and eda tool assistants. InProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, pages 1–7, 2024

  18. [18]

    Circuit transformer: A transformer that preserves logical equivalence

    Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, and Jun Wang. Circuit transformer: A transformer that preserves logical equivalence. InInternational Conference on Learning Representations (ICLR), 2025

  19. [19]

    Logic synthesis with generative deep neural networks.arXiv preprint arXiv:2406.04699, 2024

    Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, and Jun Wang. Logic synthesis with generative deep neural networks.arXiv preprint arXiv:2406.04699, 2024

  20. [20]

    Shortcircuit: Alphazero-driven circuit design

    Dimitrios Tsaras, Antoine Grosnit, Lei Chen, Zhiyao Xie, Haitham Bou-Ammar, and Mingxuan Yuan. Shortcircuit: Alphazero-driven circuit design. InarXiv preprint, 2024

  21. [21]

    Gpt-ls: Generative pre-trained transformer with offline reinforcement learning for logic synthesis

    Chenyang Lv, Ziling Wei, Weikang Qian, Junjie Ye, Chang Feng, and Zhezhi He. Gpt-ls: Generative pre-trained transformer with offline reinforcement learning for logic synthesis. arXiv/tech report manuscript, 2023. Generates optimization primitive sequences via decision-transformer style offline RL

  22. [22]

    Circuitar: Masked autoregressive circuit design with equivalence guarantees.arXiv preprint arXiv:2502.06329, 2025

    Xing Zhang, Xihan Li, Lei Chen, Mingxuan Yuan, and Jun Wang. Circuitar: Masked autoregressive circuit design with equivalence guarantees.arXiv preprint arXiv:2502.06329, 2025

  23. [23]

    Geneda: Aligning encoders and decoders for end-to-end generative logic synthesis.arXiv preprint arXiv:2501.09562, 2025

    Xing Zhang, Xing Li, Lei Chen, Mingxuan Yuan, and Jun Wang. Geneda: Aligning encoders and decoders for end-to-end generative logic synthesis.arXiv preprint arXiv:2501.09562, 2025

  24. [24]

    Chandrasekharan, D

    S. Chandrasekharan, D. Balakrishnan, and G. Rajendran. Error metrics for ap- proximate circuits: Design and evaluation.IEEE Transactions on Circuits and Systems II: Express Briefs, 67(7):1394–1398, 2020

  25. [25]

    Hedals: Highly efficient delay-driven approximate logic synthesis

    Chang Meng, Zhuangzhuang Zhou, Yue Yao, Shuyang Huang, Yuhang Chen, and Weikang Qian. Hedals: Highly efficient delay-driven approximate logic synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(11):3491–3504, 2023

  26. [26]

    Are loss functions all the same?Neural computation, 16(5):1063– 1076, 2004

    Lorenzo Rosasco, Ernesto De Vito, Andrea Caponnetto, Michele Piana, and Alessandro Verri. Are loss functions all the same?Neural computation, 16(5):1063– 1076, 2004

  27. [27]

    Mastering the game of go without human knowledge.nature, 550(7676):354– 359, 2017

    David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of go without human knowledge.nature, 550(7676):354– 359, 2017

  28. [28]

    Mishchenko et al

    A. Mishchenko et al. ABC: A system for sequential synthesis and verification. http://people.eecs.berkeley.edu/~alanmi/abc/, 2024

  29. [29]

    Probabilistic error measure- ment for approximate circuit design: A novel approach.IEEE Access, 7:115592– 115605, 2019

    Anwar Wali, Vishal Pandey, and Ashish Chauhan. Probabilistic error measure- ment for approximate circuit design: A novel approach.IEEE Access, 7:115592– 115605, 2019

  30. [30]

    A transformer that preserves logical equivalence.arXiv preprint arXiv:2403.13838, 2024

    Xihan Li. A transformer that preserves logical equivalence.arXiv preprint arXiv:2403.13838, 2024

  31. [31]

    Nangate 45nm open cell library

    Nangate, Inc. Nangate 45nm open cell library. https://si2.org/open-cell-library/,

  32. [32]

    Accessed: 2025-09-09

  33. [33]

    Problems and results of iwls2023 programming contest

    Alan Mishchenko. Problems and results of iwls2023 programming contest. https://github.com/alanminko/iwls2023-ls-contest, 2023. GitHub repository: alanminko/iwls2023-ls-contest