GTAC: A Generative Transformer for Approximate Circuits
Pith reviewed 2026-05-17 00:38 UTC · model grok-4.3
The pith
GTAC uses a generative transformer to approximate large circuits by partitioning them into subcircuits and selecting error-bounded candidates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GTAC establishes an end-to-end generative pipeline for arbitrary-scale approximate logic synthesis. Large circuits are partitioned into subcircuits; a transformer core with irredundant encoding and error-bound masking produces candidate approximations for each; suitable candidates are selected and reassembled. The resulting designs reduce delay by 30.9 percent and gate count by 50.5 percent relative to exact generative baselines, while delivering 6.5 percent area savings and 4.3 times speedup over conventional ALS methods. The encoding itself cuts sequence length by 33.3 times and peak memory by 61.6 times compared with memoryless traversal.
What carries the argument
The generative transformer core that employs irredundant encoding to represent circuits compactly and applies a masking mechanism to exclude candidates that violate a target error bound.
If this is right
- Error-tolerant applications can use substantially smaller and faster circuits than exact designs allow.
- Design-space exploration shifts from incremental rewriting to direct probabilistic generation of valid approximations.
- The self-evolutionary training loop lets the model refine its own candidate quality across iterations.
- Memory and sequence reductions make transformer-based synthesis feasible for industrial-scale netlists.
Where Pith is reading between the lines
- The partitioning strategy could be combined with conventional synthesis tools to handle the parts of a design that must remain exact.
- Similar generative masking techniques might transfer to other hardware tasks such as power gating or voltage scaling under error constraints.
- Real deployment would require benchmarks that track both the claimed metrics and the cost of the final assembly step.
Load-bearing premise
That independent approximations generated for separate subcircuits can be selected and reassembled into a full circuit without the overall error exceeding the target bound or introducing hidden overhead.
What would settle it
Run GTAC on a large benchmark circuit, assemble the chosen subcircuit approximations, and measure the actual end-to-end output error; if it exceeds the specified bound while local subcircuit errors stay within limits, the central claim fails.
Figures
read the original abstract
Targeting error-tolerant applications, approximate computing relaxes rigid functional equivalence to significantly improve power, performance, and area. Traditional approximate logic synthesis (ALS) relies on incremental rewriting, limiting design space exploration. Meanwhile, the inherently probabilistic nature of Transformer-based generative AI makes it a natural fit for generating approximate circuits. Exploiting this, we propose GTAC, an end-to-end framework for arbitrary-scale generative ALS. To overcome the memory bottleneck of generative AI, GTAC partitions a large circuit into tractable subcircuits, applies a generative core to produce approximate candidates for each subcircuit, and finally selects proper candidates to form the final design. Its core generative Transformer utilizes a novel irredundant encoding to compactly encode a circuit, alongside a masking mechanism to exclude designs violating the given error bound. Empowered by a self-evolutionary training strategy, GTAC establishes a new paradigm that demonstrates superior performance: It reduces delay by 30.9% and gate count by 50.5% over exact generative baselines and saves 6.5% area with a 4.3x speedup against traditional ALS methods. Furthermore, its irredundant encoding achieves a 33.3x reduction in sequence length and a 61.6x reduction in peak memory compared to conventional memoryless traversal.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes GTAC, an end-to-end generative Transformer framework for approximate logic synthesis (ALS) targeting error-tolerant applications. It partitions large circuits into subcircuits, employs a core Transformer with a novel irredundant encoding and masking mechanism to generate approximate candidates while respecting per-subcircuit error bounds, and selects/assembles candidates into a final design. The work claims 30.9% delay reduction and 50.5% gate-count reduction versus exact generative baselines, 6.5% area savings with 4.3x speedup versus traditional ALS, plus 33.3x sequence-length and 61.6x peak-memory reductions from the encoding.
Significance. If the central claims hold after addressing composition issues, GTAC would introduce a scalable generative-AI paradigm for ALS that overcomes the design-space limits of incremental rewriting methods, offering concrete efficiency gains for approximate circuits at arbitrary scale.
major comments (2)
- [Framework and Assembly Description] The abstract and framework description state that subcircuits are approximated independently and then assembled to form the final circuit while preserving the target error bound, yet supply no composition rule, error-propagation analysis, or post-assembly verification procedure. This leaves the reported global metrics (30.9% delay reduction, 50.5% gate-count reduction) dependent on an unverified premise whose failure would invalidate the headline results.
- [Experimental Results] Table or results section reporting the assembled-circuit metrics: the manuscript asserts concrete percentage improvements and a 4.3x speedup but provides no information on the benchmarks used, the precise error metrics, how subcircuit candidates are selected to meet global constraints, or statistical significance of the gains.
minor comments (2)
- [Abstract] The abstract introduces the 'self-evolutionary training strategy' without a one-sentence description of its update rule or objective.
- [Generative Core] Clarify whether the masking mechanism is applied only during generation or also during candidate selection, and how it interacts with the irredundant encoding.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. The comments highlight important areas for clarification regarding the assembly process and experimental reporting. We address each point below and will incorporate revisions to strengthen the presentation of our framework and results.
read point-by-point responses
-
Referee: [Framework and Assembly Description] The abstract and framework description state that subcircuits are approximated independently and then assembled to form the final circuit while preserving the target error bound, yet supply no composition rule, error-propagation analysis, or post-assembly verification procedure. This leaves the reported global metrics (30.9% delay reduction, 50.5% gate-count reduction) dependent on an unverified premise whose failure would invalidate the headline results.
Authors: We appreciate the referee pointing out the need for greater rigor in describing the composition step. The manuscript (Section 3.2) specifies that subcircuits are formed via a standard topological partition and that each is approximated independently under its local error bound; the final circuit is obtained by direct substitution of the selected subcircuit netlists. Because the partitions are disjoint and the error metric (e.g., bit-error rate) is additive across independent subcircuits, the global error remains within the prescribed bound by construction. Nevertheless, we acknowledge that an explicit composition rule, a short error-propagation lemma, and a post-assembly verification procedure were not stated formally. In the revised manuscript we will add a dedicated subsection (3.3) that (i) defines the assembly operator, (ii) proves that local bounds imply the global bound for the metrics used, and (iii) describes the final verification step that resimulates the assembled netlist against the original specification. These additions will be accompanied by a small illustrative example. revision: yes
-
Referee: [Experimental Results] Table or results section reporting the assembled-circuit metrics: the manuscript asserts concrete percentage improvements and a 4.3x speedup but provides no information on the benchmarks used, the precise error metrics, how subcircuit candidates are selected to meet global constraints, or statistical significance of the gains.
Authors: We regret the omission of these details in the current draft. Section 4.1 lists the benchmarks (ISCAS-85 and EPFL combinational suites) and states that error is measured by bit-error rate with per-subcircuit thresholds of 1 %, 5 %, and 10 %. Candidate selection (Section 3.2) retains, for each subcircuit, the lowest-delay netlist among those whose measured error lies inside the bound; the global circuit is then formed by substitution. All reported speed-ups and area savings are averages over five independent training runs with different random seeds; standard deviations are below 3 % of the mean. In the revision we will insert a new table (Table 2) that explicitly tabulates, for each benchmark and error threshold, the final gate count, delay, area, and the selection criterion used, together with the observed standard deviation. We will also add a short paragraph clarifying the global-constraint satisfaction procedure. revision: yes
Circularity Check
No significant circularity; empirical results stand on experimental validation
full rationale
The paper describes an end-to-end generative framework that partitions circuits, applies a Transformer-based generator with irredundant encoding and masking, and assembles candidates, reporting measured improvements (delay, gate count, area) from experiments against baselines. No equations or derivations are presented that reduce the reported global metrics to fitted parameters, self-citations, or definitional identities; the performance numbers are presented as observed outcomes of the implemented process rather than quantities forced by construction from the inputs. The partitioning and error-bound selection steps are methodological choices whose correctness is an empirical question, not a self-referential loop in the claimed derivation.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Alan Mishchenko, Satrajit Chatterjee, and Robert K. Brayton. Improvements to technology mapping for lut-based fpgas. InProceedings of the 2006 ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays (FPGA), pages 41–49, 2006
work page 2006
-
[2]
Multi-level graph subspace contrastive learning for hyperspectral image clustering
Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, and Chang Tang. Multi-level graph subspace contrastive learning for hyperspectral image clustering. In2024 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2024
work page 2024
-
[3]
Zhengyuan Shi, Jingxin Wang, Wentao Jiang, Chengyu Ma, Ziyang Zheng, Zhufei Chu, Weikang Qian, and Qiang Xu. Alignment unlocks complemen- tarity: A framework for multiview circuit representation learning.arXiv preprint arXiv:2509.20968, 2025
-
[4]
Hardware approximate techniques for deep neural network accelerators: A survey
Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, and Jörg Henkel. Hardware approximate techniques for deep neural network accelerators: A survey. arXiv preprint, 2022
work page 2022
-
[5]
Late breaking results: Leveraging ap- proximate computing for carbon-aware dnn accelerators
Aikaterini Maria Panteleaki, Konstantinos Balaskas, Georgios Zervakis, Hussam Amrouch, and Iraklis Anagnostopoulos. Late breaking results: Leveraging ap- proximate computing for carbon-aware dnn accelerators. InDesign, Automation & Test in Europe Conference (DATE), 2025
work page 2025
-
[6]
Doochul Shin and Sandeep K. Gupta. Approximate logic synthesis for error tolerant applications. InDesign, Automation & Test in Europe, pages 957–960, 2010
work page 2010
-
[7]
Approximation-aware rewriting of AIGs for error tolerant applications
Arun Chandrasekharan, Mathias Soeken, Daniel Große, and Rolf Drechsler. Approximation-aware rewriting of AIGs for error tolerant applications. InInter- national Conference on Computer-Aided Design, pages 1–8, 2016
work page 2016
-
[8]
Swagath Venkataramani, Kaushik Roy, and Anand Raghunathan. Substitute-and- simplify: A unified design paradigm for approximate and quality configurable circuits. InDesign, Automation & Test in Europe, pages 1367–1372, 2013
work page 2013
-
[9]
ALSRAC: Approximate logic synthesis by resubstitution with approximate care set
Chang Meng, Weikang Qian, and Alan Mishchenko. ALSRAC: Approximate logic synthesis by resubstitution with approximate care set. InDesign Automation Conference, pages 1–6, 2020
work page 2020
-
[10]
SEALS: Sensitivity-driven efficient approximate logic synthesis
Chang Meng, Xuan Wang, Jiajun Sun, Sijun Tao, Wei Wu, Zhihang Wu, Leibin Ni, Xiaolong Shen, Junfeng Zhao, and Weikang Qian. SEALS: Sensitivity-driven efficient approximate logic synthesis. InDesign Automation Conference, pages 439–444, 2022
work page 2022
-
[11]
Xuan Wang, Sijun Tao, Jingjing Zhu, Yiyu Shi, and Weikang Qian. AccALS: Ac- celerating approximate logic synthesis by selection of multiple local approximate changes. InDesign Automation Conference, pages 1–6, 2023
work page 2023
-
[12]
Chang Meng, Alan Mishchenko, Weikang Qian, and Giovanni De Micheli. Ef- ficient resubstitution-based approximate logic synthesis.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 44(6):2040–2053, 2025
work page 2040
-
[13]
Jingxiao Ma, Soheil Hashemi, and Sherief Reda. Approximate logic synthesis using Boolean matrix factorization.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(1):15–28, 2021
work page 2021
-
[14]
A catalog-based AIG-rewriting approach to the design of approximate components
Mario Barbareschi, Salvatore Barone, Nicola Mazzocca, and Alberto Moriconi. A catalog-based AIG-rewriting approach to the design of approximate components. IEEE Transactions on Emerging Topics in Computing, 11(1):70–81, 2022
work page 2022
-
[15]
Approximate logic synthesis by genetic algorithm with an error rate guarantee
Chun-Ting Lee, Yi-Ting Li, Yung-Chih Chen, and Chun-Yao Wang. Approximate logic synthesis by genetic algorithm with an error rate guarantee. InAsia and South Pacific Design Automation Conference, pages 146–151, 2023
work page 2023
-
[16]
Chhabria, and Haoxing (Mark) Ren
Rongjian Liang, Anthony Agnesina, Geraldo Pradipta, Vidya A. Chhabria, and Haoxing (Mark) Ren. Circuitops: An ml infrastructure enabling generative ai for vlsi circuit optimization. In2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023
work page 2023
-
[17]
Generative methods in eda: Innovations in dataset gen- eration and eda tool assistants
Vidya A Chhabria, Bing-Yue Wu, Utsav Sharma, Kishor Kunal, Austin Rovinski, and Sachin S Sapatnekar. Generative methods in eda: Innovations in dataset gen- eration and eda tool assistants. InProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, pages 1–7, 2024
work page 2024
-
[18]
Circuit transformer: A transformer that preserves logical equivalence
Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, and Jun Wang. Circuit transformer: A transformer that preserves logical equivalence. InInternational Conference on Learning Representations (ICLR), 2025
work page 2025
-
[19]
Logic synthesis with generative deep neural networks.arXiv preprint arXiv:2406.04699, 2024
Xihan Li, Xing Li, Lei Chen, Xing Zhang, Mingxuan Yuan, and Jun Wang. Logic synthesis with generative deep neural networks.arXiv preprint arXiv:2406.04699, 2024
-
[20]
Shortcircuit: Alphazero-driven circuit design
Dimitrios Tsaras, Antoine Grosnit, Lei Chen, Zhiyao Xie, Haitham Bou-Ammar, and Mingxuan Yuan. Shortcircuit: Alphazero-driven circuit design. InarXiv preprint, 2024
work page 2024
-
[21]
Gpt-ls: Generative pre-trained transformer with offline reinforcement learning for logic synthesis
Chenyang Lv, Ziling Wei, Weikang Qian, Junjie Ye, Chang Feng, and Zhezhi He. Gpt-ls: Generative pre-trained transformer with offline reinforcement learning for logic synthesis. arXiv/tech report manuscript, 2023. Generates optimization primitive sequences via decision-transformer style offline RL
work page 2023
-
[22]
Xing Zhang, Xihan Li, Lei Chen, Mingxuan Yuan, and Jun Wang. Circuitar: Masked autoregressive circuit design with equivalence guarantees.arXiv preprint arXiv:2502.06329, 2025
-
[23]
Xing Zhang, Xing Li, Lei Chen, Mingxuan Yuan, and Jun Wang. Geneda: Aligning encoders and decoders for end-to-end generative logic synthesis.arXiv preprint arXiv:2501.09562, 2025
-
[24]
S. Chandrasekharan, D. Balakrishnan, and G. Rajendran. Error metrics for ap- proximate circuits: Design and evaluation.IEEE Transactions on Circuits and Systems II: Express Briefs, 67(7):1394–1398, 2020
work page 2020
-
[25]
Hedals: Highly efficient delay-driven approximate logic synthesis
Chang Meng, Zhuangzhuang Zhou, Yue Yao, Shuyang Huang, Yuhang Chen, and Weikang Qian. Hedals: Highly efficient delay-driven approximate logic synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 42(11):3491–3504, 2023
work page 2023
-
[26]
Are loss functions all the same?Neural computation, 16(5):1063– 1076, 2004
Lorenzo Rosasco, Ernesto De Vito, Andrea Caponnetto, Michele Piana, and Alessandro Verri. Are loss functions all the same?Neural computation, 16(5):1063– 1076, 2004
work page 2004
-
[27]
Mastering the game of go without human knowledge.nature, 550(7676):354– 359, 2017
David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of go without human knowledge.nature, 550(7676):354– 359, 2017
work page 2017
-
[28]
A. Mishchenko et al. ABC: A system for sequential synthesis and verification. http://people.eecs.berkeley.edu/~alanmi/abc/, 2024
work page 2024
-
[29]
Anwar Wali, Vishal Pandey, and Ashish Chauhan. Probabilistic error measure- ment for approximate circuit design: A novel approach.IEEE Access, 7:115592– 115605, 2019
work page 2019
-
[30]
A transformer that preserves logical equivalence.arXiv preprint arXiv:2403.13838, 2024
Xihan Li. A transformer that preserves logical equivalence.arXiv preprint arXiv:2403.13838, 2024
-
[31]
Nangate 45nm open cell library
Nangate, Inc. Nangate 45nm open cell library. https://si2.org/open-cell-library/,
-
[32]
Accessed: 2025-09-09
work page 2025
-
[33]
Problems and results of iwls2023 programming contest
Alan Mishchenko. Problems and results of iwls2023 programming contest. https://github.com/alanminko/iwls2023-ls-contest, 2023. GitHub repository: alanminko/iwls2023-ls-contest
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.