
arxiv: 2604.23658 · v1 · submitted 2026-04-26 · 💻 cs.AR · cs.AI · cs.LG

FlowPlace: Flow Matching for Chip Placement

Pith reviewed 2026-05-08 05:03 UTC · model grok-4.3

classification 💻 cs.AR · cs.AI · cs.LG
keywords chip placement · flow matching · generative models · physical design · EDA · PPA optimization · overlap-free layouts · mask-guided synthesis

The pith

FlowPlace applies flow matching to chip placement to deliver overlap-free layouts with superior PPA metrics and 10-50 times faster sampling than diffusion models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents FlowPlace as a generative approach that replaces diffusion models for the chip placement task in physical design. It generates training data through mask guidance rather than random synthesis, trains via flow matching that accepts flexible priors, and applies hard constraints during sampling to enforce validity. On standard OpenROAD and ICCAD 2015 benchmarks this produces placements with improved power-performance-area scores, eliminates overlaps, and reduces sampling time by one to two orders of magnitude.

Core claim

FlowPlace shows that flow matching, when paired with mask-guided synthetic data generation and hard-constraint sampling, can directly produce valid chip placements that improve PPA metrics while running 10-50 times faster than prior generative baselines and guaranteeing zero cell overlaps.

What carries the argument

Flow matching, which learns a continuous vector field that transports samples from a flexible prior distribution to the target distribution of legal placements.
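As a rough illustration of what "learning a vector field" means in practice, the sketch below builds the regression target used in the simplest form of flow matching: a straight-line path between a prior sample and a data sample. The straight-line path, the function names, and the flattened macro-centre encoding are assumptions made for illustration; the paper's actual path, network, and placement encoding are not described in this summary.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from prior sample x0 (t=0) to data sample x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path; it is constant in t for linear interpolation."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(model, x0, x1, t):
    """Mean squared error between the model's velocity and the path velocity at x_t."""
    xt = interpolate(x0, x1, t)
    pred = model(xt, t)
    target = target_velocity(x0, x1)
    return sum((p - u) ** 2 for p, u in zip(pred, target)) / len(target)

# Toy usage: flattened (x, y) centres of two macros (made-up numbers).
x1 = [0.2, 0.3, 0.7, 0.6]                   # stands in for a legal placement
x0 = [random.gauss(0.5, 0.1) for _ in x1]   # prior sample; another prior could be swapped in
t = random.random()

def zero_model(xt, t):
    # Untrained stand-in for a neural network that predicts a velocity.
    return [0.0] * len(xt)

loss = flow_matching_loss(zero_model, x0, x1, t)
```

Because the prior sample `x0` only enters through the interpolation, this formulation does not require a Gaussian prior, which is one way to read the "flexible prior" property claimed above.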

If this is right

  • Placements satisfy hard legality constraints without post-processing or gradient-based repair steps.
  • Sampling time drops by 10-50× compared with diffusion-model baselines, enabling more placement candidates to be evaluated inside a fixed time budget.
  • PPA metrics improve on both OpenROAD and ICCAD 2015 suites relative to prior generative methods.
  • The same trained model can accept different prior distributions at inference time without retraining.
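The speed claim is easier to picture with a fixed-step sampler. The sketch below integrates a velocity field with Euler steps; each step costs one model evaluation, so the step count is a first-order proxy for sampling cost. The text excerpt attached to Figure 2 mentions 20-50 steps versus 1000 for ChipDiffusion; the toy velocity field and all names here are illustrative assumptions, and wall-clock speedups also depend on per-step cost, which this sketch does not model.

```python
def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)  # one model evaluation per step
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy field standing in for a trained model: it points straight at a fixed target
# with a speed chosen so the target is reached at t = 1.
target = [0.2, 0.3, 0.7, 0.6]

def toward_target(x, t):
    return [(g - xi) / (1.0 - t) for g, xi in zip(target, x)]

start = [0.5, 0.5, 0.5, 0.5]
few = euler_sample(toward_target, start, 20)     # 20 model evaluations
many = euler_sample(toward_target, start, 1000)  # 1000 model evaluations
evals_ratio = 1000 / 20                          # 50.0
```

For this idealised straight-line field both runs end at the target, which is the intuition behind using few steps when the learned paths are close to straight. A real learned field is only approximately straight, so quality at low step counts has to be measured rather than assumed.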

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hard-constraint sampling mechanism could be adapted to other geometric layout problems that require strict non-overlap rules.
  • Because flow matching supports flexible priors, the method might allow designers to inject domain-specific starting points such as partial manual floorplans.
  • Faster sampling opens the possibility of coupling the placer inside an inner optimization loop that jointly tunes cell sizing and routing congestion.

Load-bearing premise

That the combination of mask-guided data and hard-constraint sampling during inference will continue to yield high-quality, overlap-free placements when the method is applied to industrial-scale designs beyond the tested benchmarks.

What would settle it

Running FlowPlace on a new large-scale netlist and observing either cell overlaps greater than zero or PPA metrics worse than the best diffusion-based or traditional placers on that netlist.
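The overlap half of that test is cheap to automate. The sketch below computes total pairwise overlap area for axis-aligned rectangles and half-perimeter wirelength (HPWL, the wirelength metric named in the Figure 1 caption). The `(x, y, w, h)` rectangle format and the pin dictionary are assumptions for illustration; full PPA numbers require a complete physical-design flow and are not computed here.

```python
from itertools import combinations

def overlap_area(a, b):
    """Overlap area of two axis-aligned rectangles (x, y, w, h), lower-left origin."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = min(ax + aw, bx + bw) - max(ax, bx)
    dy = min(ay + ah, by + bh) - max(ay, by)
    return dx * dy if dx > 0 and dy > 0 else 0.0

def total_overlap(rects):
    """Sum of overlap areas over all unordered pairs (quadratic in the number of rects)."""
    return sum(overlap_area(a, b) for a, b in combinations(rects, 2))

def hpwl(nets, pins):
    """Half-perimeter wirelength: per net, bounding-box width plus height of its pins."""
    total = 0.0
    for net in nets:
        xs = [pins[p][0] for p in net]
        ys = [pins[p][1] for p in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Toy check: two overlapping macros and one that only touches an edge.
macros = [(0, 0, 2, 2), (1, 1, 2, 2), (3, 1, 1, 1)]
pins = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
overlap = total_overlap(macros)
wirelength = hpwl([["a", "b"], ["a", "c"]], pins)
```

Rectangles that merely share an edge count as zero overlap here. The pairwise loop is fine for macros but would need a sweep-line or spatial index for designs with millions of standard cells.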

Figures

Figures reproduced from arXiv: 2604.23658 by Chao Qian, Chengrui Gao, Chenjian Ding, Ke Xue, Mingxuan Yuan, Peng Xie, Ruo-Tong Chen, Siyuan Xu, Yunqi Shi.

Figure 1: Comparison of placement performance on superblue7. EfficientPlace (RL-based) places macros sequentially, while ChipDiffusion and FlowPlace (generative model-based) simultaneously move all macro positions. FlowPlace demonstrates superior performance with the lowest HPWL and zero overlap.
Adjacent paper text: Their sequential decision-making nature also creates a bottleneck, as early suboptimal choices have irreversible and c…
Figure 2: Progressive constraint enforcement process of hard constraint guided sampling. For clarity, we only show one movable macro 𝐴.
Adjacent paper text: … 20-50 steps, compared to the requirement of 1000 steps in ChipDiffusion [17]. 3.3 Hard Constraint Guided Sampling. While the learned flow model generates placements efficiently, it does not inherently guarantee hard constraints such as non-overlap. ChipDiffusion [17] addresses this …
Figure 3: Placement layouts and congestion visualization on superblue1. Red points are the congestion critical regions.
Adjacent paper text: These results underscore FlowPlace’s generalization capability and practical value in realistic design settings. 4.2 Additional Results. Visualization Analysis.
Original abstract

Chip placement plays an important role in physical design. While generative models like diffusion models offer promising learning-based solutions, current methods have the following limitations: they use random synthetic data for pre-training, require long sampling times, and often result in overlaps due to their dependence on gradient-based solvers during the sampling process. To overcome these issues, we propose FlowPlace, which features mask-guided synthetic data generation, flow-based efficient training with flexible prior injection, and hard constraint sampling for overlap-free layouts. Experiments on OpenROAD and ICCAD 2015 benchmarks show FlowPlace achieves better PPA metrics, 10-50$\times$ faster sampling efficiency, and zero overlaps.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes FlowPlace, a flow-matching generative model for chip placement. It addresses limitations of diffusion models by introducing mask-guided synthetic data generation for pre-training, flow-based training with flexible prior injection, and hard-constraint sampling during inference to enforce overlap-free layouts. Experiments on OpenROAD and ICCAD 2015 benchmarks are reported to yield improved PPA metrics, 10-50× faster sampling, and zero overlaps relative to prior generative approaches.

Significance. If the empirical claims hold under scrutiny, the work could advance learning-based physical design by improving sampling efficiency and hard-constraint satisfaction in generative placement models. Flow matching combined with explicit constraint handling offers a promising direction for EDA automation, though its practical significance hinges on whether the method scales beyond the small academic benchmarks tested and preserves quality under industrial netlist complexity.

major comments (3)
  1. [§4 (Experiments)] The central PPA, speed, and zero-overlap claims rest on benchmark results that lack error bars, ablation studies isolating the mask-guided data generation and hard-constraint components, and any description of how hard constraints are enforced inside the flow ODE without degrading the reported metrics or introducing instability.
  2. [§3 (Method)] No equations, pseudocode, or derivation details are supplied for the hard-constraint sampling procedure or the mask-guided data generation process; without these it is impossible to verify that the claimed 10-50× sampling speedup is preserved once constraints are applied or that the approach is free of hidden fitting parameters.
  3. [§5 (Discussion/Conclusion)] The generalization argument from OpenROAD/ICCAD 2015 benchmarks to industrial designs is unsupported; no evidence is given that the flow ODE remains stable or that constraint enforcement preserves quality when routing, timing, and heterogeneous constraints absent from the tested netlists are introduced.
minor comments (2)
  1. [Abstract and §2] The term 'flexible prior injection' is used without a concise definition or reference to the relevant flow-matching literature, which may hinder readers outside the immediate subfield.
  2. [Figures/Tables] Figure captions and tables: several result tables lack units or baseline method versions, making direct comparison of the 10-50× speedup claim difficult.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with specific plans for revision where the manuscript can be strengthened, while providing honest clarification on points that require additional context from our work.

Point-by-point responses
  1. Referee: [§4 (Experiments)] the central PPA, speed, and zero-overlap claims rest on benchmark results that lack error bars, ablation studies isolating the mask-guided data generation and hard-constraint components, and any description of how hard constraints are enforced inside the flow ODE without degrading the reported metrics or introducing instability.

    Authors: We agree that the experimental section would benefit from greater statistical rigor and isolation of components. In the revised manuscript we will add error bars computed across multiple random seeds for all PPA, runtime, and overlap metrics. We will also include new ablation tables that separately disable mask-guided data generation and hard-constraint sampling to quantify their individual contributions. For the hard-constraint enforcement inside the flow ODE, we will add a dedicated paragraph in §4 explaining the post-integration projection step, together with empirical timing and stability measurements confirming that the 10-50× speedup is retained and no additional instability is introduced. revision: yes

  2. Referee: [§3 (Method)] no equations, pseudocode, or derivation details are supplied for the hard-constraint sampling procedure or the mask-guided data generation process; without these it is impossible to verify that the claimed 10-50× sampling speedup is preserved once constraints are applied or that the approach is free of hidden fitting parameters.

    Authors: We acknowledge the omission of formal details in the method section. The revised §3 will contain the complete mathematical formulation of the mask-guided data generation process (including the mask construction equations and the resulting training objective) as well as the derivation of flexible prior injection in the flow-matching framework. For hard-constraint sampling we will insert the full algorithm in pseudocode and a short derivation showing that the constraint is realized by a single non-iterative projection after each ODE step; this projection adds negligible overhead, thereby preserving the reported speedup and introducing no additional tunable parameters beyond those already stated. revision: yes

  3. Referee: [§5 (Discussion/Conclusion)] the generalization argument from OpenROAD/ICCAD 2015 benchmarks to industrial designs is unsupported; no evidence is given that the flow ODE remains stable or that constraint enforcement preserves quality when routing, timing, and heterogeneous constraints absent from the tested netlists are introduced.

    Authors: We agree that the current evaluation is limited to academic benchmarks and does not constitute direct evidence of industrial-scale generalization. In the revised Discussion we will explicitly delineate these limitations, discuss the potential impact of additional routing, timing, and heterogeneous constraints on ODE stability, and report any observed stability metrics from our existing experiments. Full validation on proprietary industrial netlists lies outside the scope of this work; we will therefore frame the generalization claim more cautiously while highlighting the method’s design choices that aim to support future scaling. revision: partial
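The simulated rebuttal describes hard-constraint sampling as a single non-iterative projection applied after each ODE step. The sketch below shows what that control flow could look like for the simplest case pictured in Figure 2: one movable macro and one fixed obstacle. The two projection functions are deliberately simple stand-ins chosen for illustration (a push-out along the axis of least penetration, then a clamp to the canvas); they are not the paper's projection, and all names are hypothetical.

```python
def clamp_to_canvas(rect, canvas_w, canvas_h):
    """Move a rectangle (x, y, w, h) the minimum distance needed to lie inside the canvas."""
    x, y, w, h = rect
    x = min(max(x, 0.0), canvas_w - w)
    y = min(max(y, 0.0), canvas_h - h)
    return (x, y, w, h)

def push_out(rect, fixed):
    """If rect overlaps fixed, slide it along the axis of least penetration,
    away from the fixed macro's centre, until the two edges abut."""
    x, y, w, h = rect
    fx, fy, fw, fh = fixed
    dx = min(x + w, fx + fw) - max(x, fx)
    dy = min(y + h, fy + fh) - max(y, fy)
    if dx <= 0 or dy <= 0:
        return rect  # no overlap, nothing to do
    if dx <= dy:
        x = fx - w if x + w / 2 < fx + fw / 2 else fx + fw
    else:
        y = fy - h if y + h / 2 < fy + fh / 2 else fy + fh
    return (x, y, w, h)

def sample_with_projection(velocity, rect, fixed, canvas, num_steps):
    """Euler integration of the macro's position with a projection after every step."""
    x, y, w, h = rect
    dt = 1.0 / num_steps
    for k in range(num_steps):
        vx, vy = velocity((x, y), k * dt)
        x, y = x + dt * vx, y + dt * vy
        # Assumption: clamping does not push the macro back onto the obstacle.
        x, y, _, _ = clamp_to_canvas(push_out((x, y, w, h), fixed), *canvas)
    return (x, y, w, h)

# Toy run: a constant field drives the macro into a fixed obstacle.
fixed = (4.0, 4.0, 2.0, 2.0)
placed = sample_with_projection(lambda p, t: (4.0, 0.0), (0.0, 4.0, 2.0, 2.0),
                                fixed, (10.0, 10.0), 20)
```

In this toy run the macro ends flush against the obstacle's left edge instead of overlapping it. With many movable macros a single pass like this would not be enough in general, which is exactly why the referee's request for the actual algorithm and its cost matters.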

Circularity Check

0 steps flagged

No circularity; empirical results on external benchmarks with no derivation chain

full rationale

The paper proposes FlowPlace as a flow-matching approach with mask-guided synthetic data, flexible prior injection, and hard-constraint sampling. All performance claims (better PPA, 10-50× faster sampling, zero overlaps) are presented as direct experimental outcomes on the OpenROAD and ICCAD 2015 benchmarks. No equations, first-principles derivations, fitted-parameter predictions, or self-citation chains are described that would reduce any result to its own inputs by construction. The method is therefore self-contained against external benchmarks rather than internally tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that flow matching can be adapted to discrete placement constraints and that synthetic data generated with masks is sufficiently representative; no free parameters or invented entities are explicitly named in the abstract.

axioms (2)
  • domain assumption: Flow matching can be trained on mask-guided synthetic placement data to produce valid layouts.
    Invoked implicitly when claiming efficient training and overlap-free sampling.
  • domain assumption: Hard constraint sampling during inference enforces zero overlaps without harming PPA metrics.
    Central to the zero-overlap claim but not derived or proven in the abstract.

pith-pipeline@v0.9.0 · 5427 in / 1297 out tokens · 27053 ms · 2026-05-08T05:03:38.788365+00:00 · methodology


Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

  1. [1] Anthony Agnesina, Puranjay Rajvanshi, Tian Yang, Geraldo Pradipta, Austin Jiao, Ben Keller, Brucek Khailany, and Haoxing Ren. 2023. AutoDMP: Automated DREAMPlace-based macro placement. In Proceedings of the 2023 International Symposium on Physical Design.
  2. [2] Tutu Ajayi, Vidya A Chhabria, Mateus Fogaça, Soheil Hashemi, Abdelrahman Hosny, Andrew B Kahng, Minsoo Kim, Jeongsup Lee, Uday Mallappa, Marina Neseem, et al. 2019. Toward an open-source digital flow: First learnings from the openroad project. In Proceedings of the 56th Design Automation Conference.
  3. [3] Shaked Brody, Uri Alon, and Eran Yahav. 2022. How attentive are graph attention networks? In Proceedings of the 10th International Conference on Learning Representations.
  4. [4] Yifan Chen, Zaiwen Wen, Yun Liang, and Yibo Lin. 2023. Stronger mixed-size placement backbone considering second-order information. In Proceedings of the 42nd International Conference on Computer Aided Design.
  5. [5] Chaoran Cheng, Boran Han, Danielle C. Maddix, Abdul Fatir Ansari, Andrew Stuart, Michael W. Mahoney, and Bernie Wang. 2025. Gradient-free generation for hard-constrained systems. In Proceedings of the 13th International Conference on Learning Representations.
  6. [6] Chung-Kuan Cheng, Andrew B Kahng, Ilgweon Kang, and Lutong Wang. 2018. Replace: Advancing solution quality and routability validation in global placement. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38, 9 (2018), 1717–1730.
  7. [7] Chung-Kuan Cheng, Andrew B Kahng, Sayak Kundu, Yucheng Wang, and Zhiang Wang. 2023. Assessment of reinforcement learning for macro placement. In Proceedings of the 2023 International Symposium on Physical Design.
  8. [8] Ruoyu Cheng and Junchi Yan. 2021. On joint learning for solving placement and routing in chip design. In Advances in Neural Information Processing Systems 34.
  9. [9] Zijie Geng, Jie Wang, Ziyan Liu, Siyuan Xu, Zhentao Tang, Mingxuan Yuan, Jianye Hao, Yongdong Zhang, and Feng Wu. 2024. Reinforcement learning within tree search for fast macro placement. In Proceedings of the 41st International Conference on Machine Learning.
  10. [10] Anna Goldie, Azalia Mirhoseini, and Jeff Dean. 2024. That chip has sailed: A critique of unfounded skepticism around AI for chip design. arXiv:2411.10053 (2024).
  11. [11] Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems 33.
  12. [12] Xianlong Hong, Gang Huang, Yici Cai, Jiangchun Gu, Sheqin Dong, Chung-Kuan Cheng, and Jun Gu. 2000. Corner block list: An effective and efficient topological representation of non-slicing floorplan. In Proceedings of the 13th International Conference on Computer Aided Design.
  13. [13] Andrew B Kahng, Ravi Varadarajan, and Zhiang Wang. 2023. Hier-RTLMP: A hierarchical automatic macro placer for large-scale complex IP blocks. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43, 5 (2023).
  14. [14] Myung-Chul Kim, Jin Hu, Jiajia Li, and Natarajan Viswanathan. 2015. ICCAD-2015 CAD contest in incremental timing-driven placement and benchmark suite. In Proceedings of the 34th International Conference on Computer-Aided Design.
  15. [15] Yao Lai, Jinxin Liu, Zhentao Tang, Bin Wang, Jianye Hao, and Ping Luo. 2023. ChiPFormer: Transferable chip placement via offline decision transformer. In Proceedings of the 40th International Conference on Machine Learning.
  16. [16] Yao Lai, Yao Mu, and Ping Luo. 2022. MaskPlace: Fast chip placement via reinforced visual representation learning. In Advances in Neural Information Processing Systems 35.
  17. [17] Vint Lee, Minh Nguyen, Leena Elzeiny, Chun Deng, Pieter Abbeel, and John Wawrzynek. 2025. Chip placement with diffusion models. In Proceedings of the 42nd International Conference on Machine Learning.
  18. [18] Peiyu Liao, Dawei Guo, Zizheng Guo, Siting Liu, Yibo Lin, and Bei Yu. 2023. DREAMPlace 4.0: Timing-driven placement with momentum-based net weighting and lagrangian-based refinement. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42, 10 (2023), 3374–3387.
  19. [19] Jai-Ming Lin, Szu-Ting Li, and Yi-Ting Wang. 2019. Routability-driven mixed-size placement prototyping approach considering design hierarchy and indirect connectivity between macros. In Proceedings of the 56th Design Automation Conference.
  20. [20] Yibo Lin, Zixuan Jiang, Jiaqi Gu, Wuxi Li, Shounak Dhar, Haoxing Ren, Brucek Khailany, and David Z Pan. 2020. DREAMPlace: Deep learning toolkit-enabled gpu acceleration for modern VLSI placement. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 40, 4 (2020), 748–761.
  21. [21] Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. 2023. Flow matching for generative modeling. In Proceedings of the 11th International Conference on Learning Representations.
  22. [22] Xingchao Liu, Chengyue Gong, and Qiang Liu. 2022. Flow straight and fast: Learning to generate and transfer data with rectified flow. In Proceedings of the 10th International Conference on Learning Representations.
  23. [23] Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Wenjie Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, et al. 2021. A graph placement methodology for fast chip design. Nature 594, 7862 (2021), 207–212.
  24. [24] Hiroshi Murata, Kunihiro Fujiyoshi, Shigetoshi Nakatake, and Yoji Kajitani. 1996. VLSI module placement based on rectangle-packing by the sequence-pair. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 15, 12 (1996), 1518–1524.
  25. [25] John K Ousterhout. 1984. Corner stitching: A data-structuring technique for VLSI layout tools. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 3, 1 (1984), 87–100.
  26. [26] Yuan Pu, Tinghuan Chen, Zhuolun He, Chen Bai, Haisheng Zheng, Yibo Lin, and Bei Yu. 2024. IncreMacro: Incremental macro placement refinement. In Proceedings of the 2024 International Symposium on Physical Design.
  27. [27] Yunqi Shi, Xi Lin, Siyuan Xu, Shixiong Kai, Ke Xue, Mingxuan Yuan, Chao Qian, and Zhi-Hua Zhou. 2025. ReMaP: Macro placement by recursively prototyping and periphery-guided relocating. In Proceedings of the 62nd Design Automation Conference.
  28. [28] Yunqi Shi, Ke Xue, Lei Song, and Chao Qian. 2023. Macro placement by wire-mask-guided black-box optimization. In Advances in Neural Information Processing Systems 36.
  29. [29] Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, and Yoshua Bengio. 2024. Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research (2024), 1–34.
  30. [30] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30.
  31. [31] Meng-Chen Wu and Yao-Wen Chang. 2004. Placement with alignment and performance constraints using the B*-tree representation. In Proceedings of the 2004 International Conference on Computer Design.
  32. [32] Ke Xue, Ruo-Tong Chen, Xi Lin, Yunqi Shi, Shixiong Kai, Siyuan Xu, and Chao Qian. 2024. Reinforcement learning policy as macro regulator rather than macro placer. In Advances in Neural Information Processing Systems 37.
  33. [33] Jackey Z Yan, Natarajan Viswanathan, and Chris Chu. 2009. Handling complexities in modern large-scale mixed-size placement. In Proceedings of the 46th Design Automation Conference.