Automatic Conversion from Flip-flop to 3-phase Latch-based Designs
Pith reviewed 2026-05-25 15:44 UTC · model grok-4.3
The pith
An automated flow converts flip-flop designs into 3-phase latch-based circuits that match master-slave performance while using fewer latches.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an automated conversion algorithm can map flip-flop circuits to 3-phase latch designs that preserve original functionality and timing yet require substantially fewer latches than the conventional master-slave conversion, yielding average savings of 21.3 percent in latch count, 5.8 percent in area, and 16.3 percent in power across ISCAS, CEP, and CPU benchmarks.
What carries the argument
The automated design flow that performs the flip-flop to 3-phase latch conversion by generating three non-overlapping clock phases and inserting latches accordingly.
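A minimal sketch of what "three non-overlapping clock phases" can mean in practice. It assumes, purely for illustration, that one cycle is split into three equal slots with a dead gap at the end of each; the paper's actual phase widths and ordering are not given in the text available here. The names `Phase`, `make_three_phases`, and `non_overlapping` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    rise: float  # time within the cycle at which the phase goes high
    fall: float  # time within the cycle at which the phase goes low


def make_three_phases(period: float, gap: float) -> list[Phase]:
    """Split one cycle into three equal slots (an assumption), each ending in a dead gap."""
    slot = period / 3.0
    if not 0.0 <= gap < slot:
        raise ValueError("gap must be non-negative and shorter than one slot")
    return [Phase(f"p{i + 1}", i * slot, (i + 1) * slot - gap) for i in range(3)]


def non_overlapping(phases: list[Phase], period: float) -> bool:
    """True if no two phases are high at the same time, including across the cycle boundary."""
    ordered = sorted(phases, key=lambda p: p.rise)
    # Each phase must fall before the next one rises.
    within_cycle = all(a.fall <= b.rise for a, b in zip(ordered, ordered[1:]))
    # The last phase must fall before the first phase rises again in the next cycle.
    wraps_cleanly = ordered[-1].fall <= period + ordered[0].rise
    return within_cycle and wraps_cleanly


if __name__ == "__main__":
    phases = make_three_phases(period=3.0, gap=0.25)
    for p in phases:
        print(p)
    print("non-overlapping:", non_overlapping(phases, period=3.0))
```

The equal-slot split is only the simplest choice that satisfies the non-overlap property; a real flow would presumably derive phase edges from timing constraints.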
If this is right
- The resulting 3-phase circuits deliver identical performance to master-slave latch versions.
- Latch count drops by an average of 21.3 percent relative to master-slave conversion.
- Area decreases by 5.8 percent and power by 16.3 percent on the tested benchmarks.
- The flow works on a variety of ISCAS, CEP, and CPU circuits without manual redesign.
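To put the latch-count figure in perspective, here is a back-of-the-envelope sketch. It assumes that a master-slave conversion uses two latches per flip-flop and that the 3-phase conversion maps each flip-flop to either one or two latches, and it ignores any latches added or removed by retiming. Under those assumptions (which are ours, not measurements from the paper), the reported saving translates directly into the share of flip-flops that became single latches. The function names are hypothetical.

```python
def single_latch_fraction(latch_saving: float) -> float:
    """Share of FFs mapped to one latch, given the latch saving versus master-slave.

    Master-slave uses 2*N latches. If a fraction s of the N FFs becomes a single
    latch, the 3-phase design uses 2*N - s*N latches, so saving = s / 2.
    """
    return 2.0 * latch_saving


def three_phase_latches(num_ffs: int, num_single: int) -> int:
    """Latch count when num_single FFs use one latch and the rest use two."""
    return 2 * num_ffs - num_single


if __name__ == "__main__":
    share = single_latch_fraction(0.213)
    print(f"implied single-latch share: {share:.1%}")
    print("latches for 1000 FFs:", three_phase_latches(1000, 426), "versus 2000 for master-slave")
```

Read this way, an average 21.3 percent saving would correspond to roughly 43 percent of flip-flops being replaced by a single latch, if the simplifying assumptions hold.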
Where Pith is reading between the lines
- If the conversion proves reliable on larger industrial designs, existing flip-flop RTL libraries could be reused with lower power cost.
- The 3-phase style may open new trade-offs between clock distribution complexity and storage element count in future low-power flows.
- Verification tools would need to handle three-phase timing checks to make the method fully automatic end-to-end.
Load-bearing premise
The conversion algorithm preserves the original circuit's functionality and timing when it replaces flip-flops with 3-phase latches.
What would settle it
A converted 3-phase netlist that produces different outputs from the original flip-flop design on the same input vectors or violates the original timing constraints.
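The check described above can be sketched as a small simulation harness that drives a reference model and a converted model with the same input vectors and reports the first cycle on which they disagree. The toy models below are hypothetical stand-ins: a 1-bit flip-flop accumulator and a two-latch (master-slave, not 3-phase) version of it. Random simulation of this kind can expose a mismatch but cannot prove equivalence, and it says nothing about timing.

```python
import random
from typing import Optional, Sequence


class FlipFlopModel:
    """Cycle-level reference: q <= q XOR d, updated once per clock cycle."""

    def __init__(self) -> None:
        self.q = 0

    def step(self, d: int) -> int:
        out = self.q
        self.q ^= d
        return out


class LatchPairModel:
    """Same function built from two level-sensitive latches evaluated in two phases."""

    def __init__(self) -> None:
        self.master = 0
        self.slave = 0

    def step(self, d: int) -> int:
        out = self.slave
        self.master = self.slave ^ d  # first phase: master transparent, slave holds
        self.slave = self.master      # second phase: slave transparent, master holds
        return out


class StuckAtZero:
    """Deliberately broken model, used to show that the harness reports a mismatch."""

    def step(self, d: int) -> int:
        return 0


def first_mismatch(reference, candidate, vectors: Sequence[int]) -> Optional[int]:
    """Return the index of the first cycle whose outputs differ, or None if none do."""
    for cycle, d in enumerate(vectors):
        if reference.step(d) != candidate.step(d):
            return cycle
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    vectors = [rng.randint(0, 1) for _ in range(1000)]
    print("latch pair:", first_mismatch(FlipFlopModel(), LatchPairModel(), vectors))
    print("stuck at 0:", first_mismatch(FlipFlopModel(), StuckAtZero(), [1, 1, 1]))
```

A production flow would more likely rely on formal sequential equivalence checking plus static timing analysis; the harness only illustrates the shape of the falsification test.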
Original abstract
Latch-based designs have many benefits over their flip-flop based counterparts but have limited use partially because most RTL specifications are flop-centric and automatic conversion of FF to latch-based designs is challenging. Conventional conversion algorithms target master-slave latch-based designs with two non-overlapping clocks. This paper presents a novel automated design flow that converts flip-flop to 3-phase latch-based designs. The resulting circuits have the same performance as the master-slave based designs but require significantly less latches. Our experimental results demonstrate the potential for savings in the number of latches (21.3%), area (5.8%), and power (16.3%) on a variety of ISCAS, CEP, and CPU benchmark circuits, compared to the master-slave conversions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a novel automated design flow that converts flip-flop based designs to 3-phase latch-based designs. It claims the resulting circuits achieve the same performance as master-slave latch designs while requiring significantly fewer latches, with experimental results on ISCAS, CEP, and CPU benchmarks demonstrating average savings of 21.3% in latches, 5.8% in area, and 16.3% in power compared to master-slave conversions.
Significance. If the conversion algorithm is shown to preserve functionality and timing, the work could meaningfully advance practical adoption of latch-based designs by automating conversion from common FF-centric RTL, potentially enabling area and power reductions without performance trade-offs on standard benchmarks.
major comments (2)
- [Abstract] Abstract: the central claims of functional equivalence, timing preservation, and quantitative savings rest on an automated conversion algorithm whose description, correctness argument, and verification method are entirely absent from the provided text. This is load-bearing because the reported 21.3% latch reduction is only meaningful if the 3-phase designs are provably equivalent to the original FF designs.
- [Experimental results] Experimental results section (implied by abstract): no details are given on the experimental setup, benchmark synthesis flow, timing analysis method, or any functional verification (simulation or formal) that would confirm the claimed performance parity and savings. Without these, the 5.8% area and 16.3% power figures cannot be assessed for reproducibility or validity.
Simulated Author's Rebuttal
We thank the referee for their review and for highlighting areas where the manuscript requires greater detail. We address the major comments point by point below and will revise the paper accordingly.
- Referee: [Abstract] Abstract: the central claims of functional equivalence, timing preservation, and quantitative savings rest on an automated conversion algorithm whose description, correctness argument, and verification method are entirely absent from the provided text. This is load-bearing because the reported 21.3% latch reduction is only meaningful if the 3-phase designs are provably equivalent to the original FF designs.
  Authors: We acknowledge that the abstract does not contain the algorithm description or verification details. The full manuscript body includes the conversion algorithm, but to strengthen the paper we will revise the abstract to briefly outline the algorithm's approach to functional equivalence and timing preservation, and add explicit references to the correctness argument and verification method.
  Revision: yes
- Referee: [Experimental results] Experimental results section (implied by abstract): no details are given on the experimental setup, benchmark synthesis flow, timing analysis method, or any functional verification (simulation or formal) that would confirm the claimed performance parity and savings. Without these, the 5.8% area and 16.3% power figures cannot be assessed for reproducibility or validity.
  Authors: We agree that the experimental results section is insufficiently detailed. The revised manuscript will expand this section to fully describe the synthesis flow (tools, libraries, and constraints), timing analysis method, area/power estimation approach, benchmark details, and the functional verification process (including simulation and any formal checks) used to confirm equivalence and performance parity.
  Revision: yes
Circularity Check
No significant circularity
The paper describes an algorithmic conversion flow from flip-flop to 3-phase latch designs and validates it via direct experimental comparison on standard ISCAS/CEP/CPU benchmarks, reporting measured savings in latches, area, and power. No equations, fitted parameters, or uniqueness theorems appear in the provided material; the central claims rest on the existence of the implemented flow and its benchmark outcomes rather than any self-referential reduction or self-citation chain. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (unclear)
unclear: Relation between the paper passage and the cited Recognition theorem.
novel automated design flow that converts flip-flop to 3-phase latch-based designs... ILP that minimizes latches and retiming to ensure no performance loss
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
INTRODUCTION The growing use of portable/wireless electronic systems and Internet-of-Things (IoT) applications motivates the desire of smaller and more energy-efficient designs in today’s very large scale integration (VLSI) circuits. One of two devices: edge-triggered flip-flops (FFs) or level-sensitive latches are typically used as synchronization and st...
-
[2]
BACKGROUND The Sakallah, Mudge, and Olukotun (SMO) model [18] defines an optimal framework for multi-phase latch-based designs. It defines a k-phase clock as a collection of k periodic signals with a common cycle time and associated timing constraints, called the General System Timing Constraints (GSTC). The phases (...
arXiv:1906.10666v1 [cs.AR], 25 Jun 2019
-
[3]
LATCH-BASED DESIGNS This paper’s goal is to convert an FF-based to latch-based design minimizing the number of latches based on a reasonable set of constraints. This section explores the implicit trade-offs associated with these constraints and motivates our three-phase clocking approach. 3.1 Minimal Constraints There are two constraints we adopt that ar...
-
[4]
CONVERSION ALGORITHM Our conversion approach is to automatically decompose the FFs into two groups, ones that will be converted to back-to-back connected latches and ones that will be converted into a single latch. The group of FFs converted to a single latch are assigned to clock phase p1. The remaining FFs are converted to latches clocked by either p1 ...
-
[5]
pi” program, ARM-M0 was running the “hello world
EXPERIMENTAL RESULTS This section quantifies the benefits of the proposed conversion algorithm comparing the resulting 3-phase design to the original FF-based as well as traditional master-slave latch-based designs. The experiments rely on an industrial 28-nm FDSOI CMOS cell library and a range of circuits that include, ISCAS89 benchmark circuits [19], CE...
-
[6]
CONCLUSIONS This paper presents an algorithm to automatically convert a FF-based design into a 3-phase latch-based design that uses an ILP to minimize the number of required latches. Our experimental synthesis results on a broad range of benchmark circuits show significant savings are possible in both area and power with practical computational run-times...
-
[7]
R. A. Haring, R. Bellofatto, A. A. Bright, P. G. Crumley, M. B. Dombrowa, S. M. Douskey, M. R. Ellavsky, B. Gopalsamy, D. Hoenicke, T. A. Liebsch, J. A. Marcella, and M. Ohmacht, “Blue Gene/L compute chip: Control, test, and bring-up infrastructure,” IBM Journal of Research and Development, vol. 49, no. 2.3, pp. 289–301, March 2005
-
[8]
K. Singh, H. Jiao, J. Huisken, H. Fatemi, and J. P. De Gyvez, “Low power latch based design with smart retiming,” in Quality Electronic Design (ISQED), International Symposium on. IEEE, 2018, pp. 329–334
-
[9]
M. Pons, T. Le, C. Arm, D. Séverac, J. Nagel, M. Morgan, and S. Emery, “Sub-threshold latch-based icyflex2 32-bit processor with wide supply range operation,” in 2016 46th European Solid-State Device Research Conference (ESSDERC), Sept 2016, pp. 33–36
-
[10]
A. P. Hurst and R. K. Brayton, “The advantages of latch-based design under process variation,” in Proceedings of the IWLS, 2006
-
[11]
M. Fojtik, D. Fick, Y. Kim, N. Pinckney, D. Harris, D. Blaauw, and D. Sylvester, “Bubble razor: An architecture-independent approach to timing-error detection and correction,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International. IEEE, 2012, pp. 488–490
-
[12]
D. Hand, M. T. Moreira, H.-H. Huang, D. Chen, F. Butzke, Z. Li, M. Gibiluka, M. Breuer, N. L. V. Calazans, and P. A. Beerel, “Blade–a timing violation resilient asynchronous template,” in ASYNC. IEEE, 2015, pp. 21–28
-
[13]
J.-F. Lin, “Low-power pulse-triggered flip-flop design based on a signal feed-through scheme,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 1, pp. 181–185, 2014
-
[14]
S. Paik, L.-e. Yu, and Y. Shin, “Statistical time borrowing for pulsed-latch circuit designs,” in Proceedings of the 2010 Asia and South Pacific Design Automation Conference. IEEE Press, 2010, pp. 675–680
-
[15]
K. Singh, O. A. R. Rosas, H. Jiao, J. Huisken, and J. P. de Gyvez, “Multi-bit pulsed-latch based low power synchronous circuit design,” in Circuits and Systems (ISCAS), 2018 IEEE International Symposium on. IEEE, 2018, pp. 1–5
-
[16]
Y. Ding, W. Jin, G. He, and W. He, “Short path padding with multiple-Vt cells for wide-pulsed-latch based circuits at ultra-low voltage,” in 2017 IEEE 12th International Conference on ASIC (ASICON), Oct 2017, pp. 985–988
-
[17]
Y. Shin and S. Paik, “Pulsed-latch circuits: A new dimension in asic design,” IEEE Design & Test of Computers, vol. 28, no. 6, pp. 50–57, 2011
-
[18]
K. Yoshikawa, Y. Hagihara, K. Kanamaru, Y. Nakamura, S. Inui, and T. Yoshimura, “Timing optimization by replacing flip-flops to latches,” in Proceedings of the Asia and South Pacific Design Automation Conference. IEEE Press, 2004, pp. 186–191
-
[19]
J. Cortadella, A. Kondratyev, L. Lavagno, and C. P. Sotiriou, “Desynchronization: Synthesis of asynchronous circuits from synchronous specifications,” IEEE Trans. on CAD, vol. 25, no. 10, pp. 1904–1921, 2006
-
[20]
A. Branover, R. Kol, and R. Ginosar, “Asynchronous design by conversion: Converting synchronous circuits into asynchronous ones,” in Proceedings of the conference on Design, Automation and Test in Europe-Volume 2. IEEE Computer Society, 2004, pp. 870–875
-
[21]
A. Saifhashemi, D. Hand, P. A. Beerel, W. Koven, and H. Wang, “Performance and area optimization of a bundled-data Intel processor through resynthesis,” in ASYNC, May 2014, pp. 110–111
-
[22]
Y. Zhang, H. Cheng, D. Chen, H. Fu, S. Agarwal, M. Lin, and P. A. Beerel, “Challenges in building an open-source flow from RTL to bundled-data design,” in Asynchronous Circuits and Systems (ASYNC), IEEE International Symposium on, 2018
-
[23]
H. Cheng, H.-L. Wang, M. Zhang, D. Hand, and P. A. Beerel, “Automatic retiming of two-phase latch-based resilient circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018
-
[24]
K. A. Sakallah, T. N. Mudge, and O. A. Olukotun, “Optimal clocking of synchronous systems,” in ACM International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, 1990, pp. 1–21
-
[25]
“ISCAS89: International symposium on circuits and systems sequential benchmark. http://www.pld.ttu.ee/~maksim/benchmarks/iscas89/verilog/.”
-
[26]
“MIT-LL common evaluation platform (CEP),” https://github.com/mit-ll/CEP, available: 2019
- [27]
-
[28]
“Rocket chip,” https://github.com/freechipsproject/rocket-chip, available: 2016
-
[29]
“ARM Cortex M0,” https://developer.arm.com/products/processors/cortex-m/cortex-m0
-
[30]
V. Singhal, S. Malik, and R. K. Brayton, “The case for retiming with explicit reset circuitry,” in Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design. IEEE Computer Society, 1997, pp. 618–625
-
[31]
Gurobi Optimization, LLC, “Gurobi optimizer reference manual,” 2018. [Online]. Available: http://www.gurobi.com
-
[32]
D. Chinnery and K. Keutzer, Closing the gap between ASIC & custom: tools and techniques for high-performance ASIC design. Springer Science & Business Media, 2002
-
[33]
F. Brglez, D. Bryan, and K. Kozminski, “Combinational profiles of sequential benchmark circuits,” in IEEE International Symposium on Circuits and Systems, 1989, pp. 1929–1934
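The excerpts in entries [4] and [6] above describe splitting the flip-flops into a single-latch group and a back-to-back-latch group, with an ILP minimizing the total latch count. The sketch below illustrates that kind of grouping problem on a toy graph. It is not the paper's formulation: the excerpt is truncated before the real constraints appear, so the rule used here (two flip-flops joined by a combinational path, or a flip-flop feeding itself, cannot both be single latches) is a hypothetical stand-in, and exhaustive search replaces the ILP solver. All names are illustrative.

```python
from itertools import combinations
from typing import Iterable, Sequence


def best_single_latch_group(ffs: Sequence[str], edges: Iterable[tuple[str, str]]) -> set[str]:
    """Largest set of FFs in which no member has a self-loop and no two members share an edge.

    ASSUMPTION: the adjacency rule is a placeholder for the paper's actual constraints.
    Exhaustive search is exponential and only suitable for tiny examples.
    """
    # An edge (a, a) becomes frozenset({a}), which marks a self-loop.
    blocked = {frozenset(edge) for edge in edges}
    for size in range(len(ffs), 0, -1):
        for group in combinations(ffs, size):
            if any(frozenset((ff,)) in blocked for ff in group):
                continue  # a member feeds itself
            if any(frozenset(pair) in blocked for pair in combinations(group, 2)):
                continue  # two members are directly connected
            return set(group)
    return set()


def total_latches(num_ffs: int, single_group: set[str]) -> int:
    """Single-latch FFs cost one latch; every other FF costs two."""
    return 2 * num_ffs - len(single_group)


if __name__ == "__main__":
    ffs = ["a", "b", "c", "d"]
    # A chain a-b-c-d in which d also feeds back to itself.
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "d")]
    group = best_single_latch_group(ffs, edges)
    print("single-latch group:", sorted(group))
    print("latches:", total_latches(len(ffs), group), "versus", 2 * len(ffs), "for master-slave")
```

On this toy chain the search keeps two flip-flops as single latches, giving six latches instead of eight. An ILP, as the paper reports using, would scale the same objective to realistic designs and could encode additional timing constraints.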