Automatic Conversion from Flip-flop to 3-phase Latch-based Designs

Huimei Cheng; Peter A. Beerel; Yichen Gu

arxiv: 1906.10666 · v1 · pith:WWUZA3B6new · submitted 2019-06-25 · 💻 cs.AR

Pith reviewed 2026-05-25 15:44 UTC · model grok-4.3

classification 💻 cs.AR

keywords flip-flop to latch conversion · 3-phase latch design · master-slave comparison · automated RTL transformation · latch count reduction · low-power circuit design · timing preservation

The pith

An automated flow converts flip-flop designs into 3-phase latch-based circuits that match master-slave performance while using fewer latches.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a new automated design flow that transforms conventional flip-flop RTL specifications into 3-phase latch-based implementations. Unlike earlier methods limited to master-slave pairs with two non-overlapping clocks, the 3-phase approach allows greater latch sharing while keeping the same clock speed and timing behavior. On standard benchmark suites the converted circuits show measurable reductions in latch count, total area, and power draw. The method therefore removes a key barrier that has kept latch-based styles from wider use in RTL flows.

Core claim

The central claim is that an automated conversion algorithm can map flip-flop circuits to 3-phase latch designs that preserve original functionality and timing yet require substantially fewer latches than the conventional master-slave conversion, yielding average savings of 21.3 percent in latch count, 5.8 percent in area, and 16.3 percent in power across ISCAS, CEP, and CPU benchmarks.

What carries the argument

The automated design flow that performs the flip-flop to 3-phase latch conversion by generating three non-overlapping clock phases and inserting latches accordingly.
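
The grouping idea can be illustrated with a toy model; this is not the paper's algorithm. The paper splits flip-flops into a group that becomes a single latch and a group that becomes back-to-back latches, and uses an ILP to minimize the latch count. The sketch below swaps the ILP for brute-force search and assumes, purely for illustration, that two flip-flops joined by a combinational path cannot both become single latches and that a flip-flop feeding itself cannot become one at all. The function name, the example graph, and that rule are hypothetical.

```python
from itertools import combinations

def min_latches(num_ffs, edges):
    """Brute-force the largest set of FFs that may become single latches.

    edges: (src, dst) pairs, each a combinational path from FF src to FF dst.
    Assumed rule (hypothetical, not taken from the paper): FFs joined by an
    edge cannot both be single latches; a self-loop FF cannot be one at all.
    Returns (set of single-latch FFs, total latch count).
    """
    # A self-loop (a, a) collapses to frozenset({a}), which blocks FF a alone.
    conflicts = {frozenset(edge) for edge in edges}
    for size in range(num_ffs, -1, -1):          # try the largest groups first
        for group in combinations(range(num_ffs), size):
            chosen = set(group)
            if all(not conflict <= chosen for conflict in conflicts):
                # Single-latch FFs cost 1 latch, every other FF costs 2.
                return chosen, 2 * num_ffs - size

# Four flip-flops connected in a ring: 0 -> 1 -> 2 -> 3 -> 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
single, latches = min_latches(4, ring)
print(sorted(single), latches, "vs master-slave", 2 * 4)
```

On this ring the toy model keeps two flip-flops as single latches, so it needs 6 latches where a master-slave conversion would need 8. Brute force is only workable for tiny graphs, which is one reason a real flow would hand the problem to an ILP solver.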

If this is right

The resulting 3-phase circuits deliver identical performance to master-slave latch versions.
Latch count drops by an average of 21.3 percent relative to master-slave conversion.
Area decreases by 5.8 percent and power by 16.3 percent on the tested benchmarks.
The flow works on a variety of ISCAS, CEP, and CPU circuits without manual redesign.
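
As a back-of-the-envelope reading of the 21.3 percent figure, suppose master-slave conversion spends two latches per flip-flop while the 3-phase flow spends either one or two. Under that simplification, which ignores retiming and any latches that synthesis adds or removes, the share of flip-flops mapped to a single latch follows directly from the saving. The function name and the one-or-two model are assumptions made for this sketch.

```python
def single_latch_share(latch_saving):
    """Share of FFs that become a single latch under the one-or-two model.

    Master-slave uses 2*N latches for N flip-flops. If a share s of them
    becomes a single latch, the 3-phase design uses (2 - s)*N latches, so the
    relative saving is s / 2 and therefore s = 2 * saving.
    """
    return 2.0 * latch_saving

share = single_latch_share(0.213)
print(f"{share:.1%} of flip-flops would map to a single latch")
```

Under these assumptions, roughly two in five flip-flops would end up as a single latch on average.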

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the conversion proves reliable on larger industrial designs, existing flip-flop RTL libraries could be reused with lower power cost.
The 3-phase style may open new trade-offs between clock distribution complexity and storage element count in future low-power flows.
Verification tools would need to handle three-phase timing checks to make the method fully automatic end-to-end.

Load-bearing premise

The conversion algorithm preserves the original circuit's functionality and timing when it replaces flip-flops with 3-phase latches.

What would settle it

A converted 3-phase netlist that produces different outputs from the original flip-flop design on the same input vectors or violates the original timing constraints.
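
The functional half of that check can be sketched as a cycle-by-cycle comparison on shared input vectors. The example below uses a hypothetical one-bit toggle register with an enable input, modeled once with an edge-triggered flip-flop and once with a master-slave latch pair on two non-overlapping phases. It is a behavioral illustration of the test, not the authors' verification flow, and every name in it is invented for this sketch.

```python
import random

def run_flip_flop(enables):
    """Reference model: q toggles at each clock edge when enable is 1."""
    q, trace = 0, []
    for en in enables:
        q = q ^ en               # next-state logic sampled at the clock edge
        trace.append(q)
    return trace

def run_latch_pair(enables):
    """Same register built from a master latch and a slave latch.

    Each cycle has two non-overlapping phases: the master is transparent in
    the first and the slave in the second, so data never races through both.
    """
    master, slave, trace = 0, 0, []
    for en in enables:
        master = slave ^ en      # phase 1: master transparent, slave holds
        slave = master           # phase 2: slave transparent, master holds
        trace.append(slave)
    return trace

def first_mismatch(reference, candidate, vectors):
    """Return the index of the first differing output, or None if all match."""
    for i, (a, b) in enumerate(zip(reference(vectors), candidate(vectors))):
        if a != b:
            return i
    return None

random.seed(0)
vectors = [random.randint(0, 1) for _ in range(1000)]
print(first_mismatch(run_flip_flop, run_latch_pair, vectors))
```

Random simulation of this kind can expose a mismatch but cannot prove equivalence, and it says nothing about timing, so a full check would also need formal equivalence and static timing analysis.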

Figures

Figures reproduced from arXiv: 1906.10666 by Huimei Cheng, Peter A. Beerel, Yichen Gu.

**Figure 2.** Example non-linear pipeline requiring 4-phase clocking

**Figure 3.** Enabled (a) to gated clock (b) transformation

**Figure 4.** Duplicated clock gating logic for phase conversion

**Figure 5.** 3-phase clocks for modified retiming assignments

Original abstract

Latch-based designs have many benefits over their flip-flop based counterparts but have limited use partially because most RTL specifications are flop-centric and automatic conversion of FF to latch-based designs is challenging. Conventional conversion algorithms target master-slave latch-based designs with two non-overlapping clocks. This paper presents a novel automated design flow that converts flip-flop to 3-phase latch-based designs. The resulting circuits have the same performance as the master-slave based designs but require significantly less latches. Our experimental results demonstrate the potential for savings in the number of latches (21.3%), area (5.8%), and power (16.3%) on a variety of ISCAS, CEP, and CPU benchmark circuits, compared to the master-slave conversions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper gives a working automated flow to convert flip-flop designs to 3-phase latches and reports clear savings versus master-slave versions on standard benchmarks.

The main takeaway is a practical automated conversion from flip-flop RTL to 3-phase latch circuits. The designs keep the same performance as the usual two-phase master-slave approach but need fewer latches overall. The authors back this with average savings of 21.3% in latch count, 5.8% area, and 16.3% power across ISCAS, CEP, and CPU benchmarks compared to master-slave conversions. That is the concrete result worth noting first.

What stands out as new is the choice of three-phase clocking instead of the conventional two non-overlapping phases, plus the claim that an automated flow can reach it without manual redesign. The paper does well by sticking to public benchmarks and giving direct head-to-head numbers rather than just theoretical arguments. The experimental setup appears reproducible enough on its face since it uses standard suites and reports percentage deltas against a clear baseline.

The soft spot is that the abstract supplies almost no description of the conversion algorithm or the checks used to confirm functional and timing equivalence after the change. The full paper needs to show those steps in enough detail that a reader could follow or replicate the flow; if that section is thin or relies on unstated assumptions about synthesis tools or clock constraints, the savings become harder to trust or extend. No circular reasoning or hidden fitted parameters show up in the reported claims.

This work is aimed at people doing VLSI design automation or latch-based low-power optimizations. A reader who needs concrete RTL transformation methods would get usable numbers and a different clocking angle to consider. I would bring it to a reading group focused on digital design flows. I would not cite it in my own work unless I were actively comparing latch styles. It deserves peer review because the central claim is testable against the method and the benchmarks are standard.

Referee Report

2 major / 0 minor

Summary. The paper presents a novel automated design flow that converts flip-flop based designs to 3-phase latch-based designs. It claims the resulting circuits achieve the same performance as master-slave latch designs while requiring significantly fewer latches, with experimental results on ISCAS, CEP, and CPU benchmarks demonstrating average savings of 21.3% in latches, 5.8% in area, and 16.3% in power compared to master-slave conversions.

Significance. If the conversion algorithm is shown to preserve functionality and timing, the work could meaningfully advance practical adoption of latch-based designs by automating conversion from common FF-centric RTL, potentially enabling area and power reductions without performance trade-offs on standard benchmarks.

major comments (2)

[Abstract] The central claims of functional equivalence, timing preservation, and quantitative savings rest on an automated conversion algorithm whose description, correctness argument, and verification method are entirely absent from the provided text. This is load-bearing because the reported 21.3% latch reduction is only meaningful if the 3-phase designs are provably equivalent to the original FF designs.
[Experimental results] (section implied by the abstract) No details are given on the experimental setup, benchmark synthesis flow, timing analysis method, or any functional verification (simulation or formal) that would confirm the claimed performance parity and savings. Without these, the 5.8% area and 16.3% power figures cannot be assessed for reproducibility or validity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and for highlighting areas where the manuscript requires greater detail. We address the major comments point by point below and will revise the paper accordingly.

Referee: [Abstract] The central claims of functional equivalence, timing preservation, and quantitative savings rest on an automated conversion algorithm whose description, correctness argument, and verification method are entirely absent from the provided text. This is load-bearing because the reported 21.3% latch reduction is only meaningful if the 3-phase designs are provably equivalent to the original FF designs.

Authors: We acknowledge that the abstract does not contain the algorithm description or verification details. The full manuscript body includes the conversion algorithm, but to strengthen the paper we will revise the abstract to briefly outline the algorithm's approach to functional equivalence and timing preservation, and add explicit references to the correctness argument and verification method.
Revision: yes

Referee: [Experimental results] (section implied by the abstract) No details are given on the experimental setup, benchmark synthesis flow, timing analysis method, or any functional verification (simulation or formal) that would confirm the claimed performance parity and savings. Without these, the 5.8% area and 16.3% power figures cannot be assessed for reproducibility or validity.

Authors: We agree that the experimental results section is insufficiently detailed. The revised manuscript will expand this section to fully describe the synthesis flow (tools, libraries, and constraints), timing analysis method, area/power estimation approach, benchmark details, and the functional verification process (including simulation and any formal checks) used to confirm equivalence and performance parity.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper describes an algorithmic conversion flow from flip-flop to 3-phase latch designs and validates it via direct experimental comparison on standard ISCAS/CEP/CPU benchmarks, reporting measured savings in latches, area, and power. No equations, fitted parameters, or uniqueness theorems appear in the provided material; the central claims rest on the existence of the implemented flow and its benchmark outcomes rather than any self-referential reduction or self-citation chain. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no information on any free parameters, axioms, or invented entities in the method.

pith-pipeline@v0.9.0 · 5653 in / 1129 out tokens · 52090 ms · 2026-05-25T15:44:21.561813+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: novel automated design flow that converts flip-flop to 3-phase latch-based designs... ILP that minimizes latches and retiming to ensure no performance loss

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 1 internal anchor

[1] INTRODUCTION. The growing use of portable/wireless electronic systems and Internet-of-Things (IoT) applications motivates the desire of smaller and more energy-efficient designs in today’s very large scale integration (VLSI) circuits. One of two devices: edge-triggered flip-flops (FFs) or level-sensitive latches are typically used as synchronization and st...

[2] BACKGROUND. The Sakallah, Mudge, and Olukotun (SMO) model [18] defines an optimal framework for multi-phase latch-based designs. It defines a k-phase clock as a collection of k periodic signals with a common cycle time and associated timing constraints, called the General System Timing Constraints (GSTC). The phases (...

[3] LATCH-BASED DESIGNS. This paper’s goal is to convert an FF-based to latch-based design minimizing the number of latches based on a reasonable set of constraints. This section explores the implicit trade-offs associated with these constraints and motivates our three-phase clocking approach. 3.1 Minimal Constraints There are two constraints we adopt that ar...

[4] CONVERSION ALGORITHM. Our conversion approach is to automatically decompose the FFs into two groups, ones that will be converted to back-to-back connected latches and ones that will be converted into a single latch. The group of FFs converted to a single latch are assigned to clock phase p1. The remaining FFs are converted to latches clocked by either p1 ...

[5] EXPERIMENTAL RESULTS. This section quantifies the benefits of the proposed conversion algorithm comparing the resulting 3-phase design to the original FF-based as well as traditional master-slave latch-based designs. The experiments rely on an industrial 28-nm FDSOI CMOS cell library and a range of circuits that include, ISCAS89 benchmark circuits [19], CE...
pi” program, ARM-M0 was running the “hello world

[6] CONCLUSIONS. This paper presents an algorithm to automatically convert a FF-based design into a 3-phase latch-based design that uses an ILP to minimize the number of required latches. Our experimental synthesis results on a broad range of benchmark circuits show significant savings are possible in both area and power with practical computational run-times...

[7] R. A. Haring, R. Bellofatto, A. A. Bright, P. G. Crumley, M. B. Dombrowa, S. M. Douskey, M. R. Ellavsky, B. Gopalsamy, D. Hoenicke, T. A. Liebsch, J. A. Marcella, and M. Ohmacht, “Blue gene/L compute chip: Control, test, and bring-up infrastructure,” IBM Journal of Research and Development, vol. 49, no. 2.3, pp. 289–301, March 2005.

[8] K. Singh, H. Jiao, J. Huisken, H. Fatemi, and J. P. De Gyvez, “Low power latch based design with smart retiming,” in Quality Electronic Design (ISQED), International Symposium on. IEEE, 2018, pp. 329–334.

[9] M. Pons, T. Le, C. Arm, D. Séverac, J. Nagel, M. Morgan, and S. Emery, “Sub-threshold latch-based icyflex2 32-bit processor with wide supply range operation,” in 2016 46th European Solid-State Device Research Conference (ESSDERC), Sept 2016, pp. 33–36.

[10] A. P. Hurst and R. K. Brayton, “The advantages of latch-based design under process variation,” in Proceedings of the IWLS, 2006.

[11] M. Fojtik, D. Fick, Y. Kim, N. Pinckney, D. Harris, D. Blaauw, and D. Sylvester, “Bubble razor: An architecture-independent approach to timing-error detection and correction,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International. IEEE, 2012, pp. 488–490.

[12] D. Hand, M. T. Moreira, H.-H. Huang, D. Chen, F. Butzke, Z. Li, M. Gibiluka, M. Breuer, N. L. V. Calazans, and P. A. Beerel, “Blade–a timing violation resilient asynchronous template,” in ASYNC. IEEE, 2015, pp. 21–28.

[13] J.-F. Lin, “Low-power pulse-triggered flip-flop design based on a signal feed-through scheme,” IEEE Transaction on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 1, pp. 181–185, 2014.

[14] S. Paik, L.-e. Yu, and Y. Shin, “Statistical time borrowing for pulsed-latch circuit designs,” in Proceedings of the 2010 Asia and South Pacific Design Automation Conference. IEEE Press, 2010, pp. 675–680.

[15] K. Singh, O. A. R. Rosas, H. Jiao, J. Huisken, and J. P. de Gyvez, “Multi-bit pulsed-latch based low power synchronous circuit design,” in Circuits and Systems (ISCAS), 2018 IEEE International Symposium on. IEEE, 2018, pp. 1–5.

[16] Y. Ding, W. Jin, G. He, and W. He, “Short path padding with multiple-Vt cells for wide-pulsed-latch based circuits at ultra-low voltage,” in 2017 IEEE 12th International Conference on ASIC (ASICON), Oct 2017, pp. 985–988.

[17] Y. Shin and S. Paik, “Pulsed-latch circuits: A new dimension in asic design,” IEEE Design & Test of Computers, vol. 28, no. 6, pp. 50–57, 2011.

[18] K. Yoshikawa, Y. Hagihara, K. Kanamaru, Y. Nakamura, S. Inui, and T. Yoshimura, “Timing optimization by replacing flip-flops to latches,” in Proceedings of the Asia and South Pacific Design Automation Conference. IEEE Press, 2004, pp. 186–191.

[19] J. Cortadella, A. Kondratyev, L. Lavagno, and C. P. Sotiriou, “Desynchronization: Synthesis of asynchronous circuits from synchronous specifications,” IEEE Trans. on CAD, vol. 25, no. 10, pp. 1904–1921, 2006.

[20] A. Branover, R. Kol, and R. Ginosar, “Asynchronous design by conversion: Converting synchronous circuits into asynchronous ones,” in Proceedings of the conference on Design, Automation and Test in Europe-Volume 2. IEEE Computer Society, 2004, pp. 870–875.

[21] A. Saifhashemi, D. Hand, P. A. Beerel, W. Koven, and H. Wang, “Performance and area optimization of a bundled-data Intel processor through resynthesis,” in ASYNC, May 2014, pp. 110–111.

[22] Y. Zhang, H. Cheng, D. Chen, H. Fu, S. Agarwal, M. Lin, and P. A. Beerel, “Challenges in building an open-source flow from RTL to bundled-data design,” in Asynchronous Circuits and Systems (ASYNC), IEEE International Symposium on, 2018.

[23] H. Cheng, H.-L. Wang, M. Zhang, D. Hand, and P. A. Beerel, “Automatic retiming of two-phase latch-based resilient circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018.

[24] K. A. Sakallah, T. N. Mudge, and O. A. Olukotun, “Optimal clocking of synchronous systems,” in ACM International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, 1990, pp. 1–21.

[25] “ISCAS89: International symposium on circuits and systems sequential benchmark,” http://www.pld.ttu.ee/~maksim/benchmarks/iscas89/verilog/.

[26] “MIT-LL common evaluation platform (CEP),” https://github.com/mit-ll/CEP, available: 2019.

[27] “Plasma CPU,” http://opencores.org/project,plasma, available: 2014.

[28] “Rocket chip,” https://github.com/freechipsproject/rocket-chip, available: 2016.

[29] “ARM Cortex M0,” https://developer.arm.com/products/processors/cortex-m/cortex-m0.

[30] V. Singhal, S. Malik, and R. K. Brayton, “The case for retiming with explicit reset circuitry,” in Proceedings of the 1996 IEEE/ACM international conference on Computer-aided design. IEEE Computer Society, 1997, pp. 618–625.

[31] Gurobi Optimization, LLC, “Gurobi optimizer reference manual,” 2018. [Online]. Available: http://www.gurobi.com.

[32] D. Chinnery and K. Keutzer, Closing the gap between ASIC & custom: tools and techniques for high-performance ASIC design. Springer Science & Business Media, 2002.

[33] F. Brglez, D. Bryan, and K. Kozminski, “Combinational profiles of sequential benchmark circuits,” in IEEE International Symposium on Circuits and Systems, 1989, pp. 1929–1934.
