FlowPlace: Flow Matching for Chip Placement
Pith reviewed 2026-05-08 05:03 UTC · model grok-4.3
The pith
FlowPlace applies flow matching to chip placement to deliver overlap-free layouts with superior PPA metrics and 10-50 times faster sampling than diffusion models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FlowPlace shows that flow matching, when paired with mask-guided synthetic data generation and hard-constraint sampling, can directly produce valid chip placements that improve PPA metrics while running 10-50 times faster than prior generative baselines and guaranteeing zero cell overlaps.
What carries the argument
Flow matching, which learns a continuous vector field that transports samples from a flexible prior distribution to the target distribution of legal placements.
If this is right
- Placements satisfy hard legality constraints without post-processing or gradient-based repair steps.
- Sampling time drops by 10-50x compared with diffusion-model baselines, enabling more placement candidates to be evaluated inside a fixed time budget.
- PPA metrics improve on both OpenROAD and ICCAD 2015 suites relative to prior generative methods.
- The same trained model can accept different prior distributions at inference time without retraining.
Where Pith is reading between the lines
- The hard-constraint sampling mechanism could be adapted to other geometric layout problems that require strict non-overlap rules.
- Because flow matching supports flexible priors, the method might allow designers to inject domain-specific starting points such as partial manual floorplans.
- Faster sampling opens the possibility of coupling the placer inside an inner optimization loop that jointly tunes cell sizing and routing congestion.
Load-bearing premise
That the combination of mask-guided data and hard-constraint sampling during inference will continue to yield high-quality, overlap-free placements when the method is applied to industrial-scale designs beyond the tested benchmarks.
What would settle it
Running FlowPlace on a new large-scale netlist and observing either cell overlaps greater than zero or PPA metrics worse than the best diffusion-based or traditional placers on that netlist.
Figures
read the original abstract
Chip placement plays an important role in physical design. While generative models like diffusion models offer promising learning-based solutions, current methods have the following limitations: they use random synthetic data for pre-training, require long sampling times, and often result in overlaps due to their dependence on gradient-based solvers during the sampling process. To overcome these issues, we propose FlowPlace, which features mask-guided synthetic data generation, flow-based efficient training with flexible prior injection, and hard constraint sampling for overlap-free layouts. Experiments on OpenROAD and ICCAD 2015 benchmarks show FlowPlace achieves better PPA metrics, 10-50$\times$ faster sampling efficiency, and zero overlaps.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes FlowPlace, a flow-matching generative model for chip placement. It addresses limitations of diffusion models by introducing mask-guided synthetic data generation for pre-training, flow-based training with flexible prior injection, and hard-constraint sampling during inference to enforce overlap-free layouts. Experiments on OpenROAD and ICCAD 2015 benchmarks are reported to yield improved PPA metrics, 10-50× faster sampling, and zero overlaps relative to prior generative approaches.
Significance. If the empirical claims hold under scrutiny, the work could advance learning-based physical design by improving sampling efficiency and hard-constraint satisfaction in generative placement models. Flow matching combined with explicit constraint handling offers a promising direction for EDA automation, though its practical significance hinges on whether the method scales beyond the small academic benchmarks tested and preserves quality under industrial netlist complexity.
major comments (3)
- [§4 (Experiments)] §4 (Experiments): the central PPA, speed, and zero-overlap claims rest on benchmark results that lack error bars, ablation studies isolating the mask-guided data generation and hard-constraint components, and any description of how hard constraints are enforced inside the flow ODE without degrading the reported metrics or introducing instability.
- [§3 (Method)] §3 (Method): no equations, pseudocode, or derivation details are supplied for the hard-constraint sampling procedure or the mask-guided data generation process; without these it is impossible to verify that the claimed 10-50× sampling speedup is preserved once constraints are applied or that the approach is free of hidden fitting parameters.
- [§5 (Discussion/Conclusion)] §5 (Discussion/Conclusion): the generalization argument from OpenROAD/ICCAD 2015 benchmarks to industrial designs is unsupported; no evidence is given that the flow ODE remains stable or that constraint enforcement preserves quality when routing, timing, and heterogeneous constraints absent from the tested netlists are introduced.
minor comments (2)
- [Abstract and §2] Abstract and §2: the term 'flexible prior injection' is used without a concise definition or reference to the relevant flow-matching literature, which may hinder readers outside the immediate subfield.
- [Figures/Tables] Figure captions and tables: several result tables lack units or baseline method versions, making direct comparison of the 10-50× speedup claim difficult.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below with specific plans for revision where the manuscript can be strengthened, while providing honest clarification on points that require additional context from our work.
read point-by-point responses
-
Referee: [§4 (Experiments)] the central PPA, speed, and zero-overlap claims rest on benchmark results that lack error bars, ablation studies isolating the mask-guided data generation and hard-constraint components, and any description of how hard constraints are enforced inside the flow ODE without degrading the reported metrics or introducing instability.
Authors: We agree that the experimental section would benefit from greater statistical rigor and isolation of components. In the revised manuscript we will add error bars computed across multiple random seeds for all PPA, runtime, and overlap metrics. We will also include new ablation tables that separately disable mask-guided data generation and hard-constraint sampling to quantify their individual contributions. For the hard-constraint enforcement inside the flow ODE, we will add a dedicated paragraph in §4 explaining the post-integration projection step, together with empirical timing and stability measurements confirming that the 10-50× speedup is retained and no additional instability is introduced. revision: yes
-
Referee: [§3 (Method)] no equations, pseudocode, or derivation details are supplied for the hard-constraint sampling procedure or the mask-guided data generation process; without these it is impossible to verify that the claimed 10-50× sampling speedup is preserved once constraints are applied or that the approach is free of hidden fitting parameters.
Authors: We acknowledge the omission of formal details in the method section. The revised §3 will contain the complete mathematical formulation of the mask-guided data generation process (including the mask construction equations and the resulting training objective) as well as the derivation of flexible prior injection in the flow-matching framework. For hard-constraint sampling we will insert the full algorithm in pseudocode and a short derivation showing that the constraint is realized by a single non-iterative projection after each ODE step; this projection adds negligible overhead, thereby preserving the reported speedup and introducing no additional tunable parameters beyond those already stated. revision: yes
-
Referee: [§5 (Discussion/Conclusion)] the generalization argument from OpenROAD/ICCAD 2015 benchmarks to industrial designs is unsupported; no evidence is given that the flow ODE remains stable or that constraint enforcement preserves quality when routing, timing, and heterogeneous constraints absent from the tested netlists are introduced.
Authors: We agree that the current evaluation is limited to academic benchmarks and does not constitute direct evidence of industrial-scale generalization. In the revised Discussion we will explicitly delineate these limitations, discuss the potential impact of additional routing, timing, and heterogeneous constraints on ODE stability, and report any observed stability metrics from our existing experiments. Full validation on proprietary industrial netlists lies outside the scope of this work; we will therefore frame the generalization claim more cautiously while highlighting the method’s design choices that aim to support future scaling. revision: partial
Circularity Check
No circularity; empirical results on external benchmarks with no derivation chain
full rationale
The paper proposes FlowPlace as a flow-matching approach with mask-guided synthetic data, flexible prior injection, and hard-constraint sampling. All performance claims (better PPA, 10-50x faster sampling, zero overlaps) are presented as direct experimental outcomes on the OpenROAD and ICCAD 2015 benchmarks. No equations, first-principles derivations, fitted-parameter predictions, or self-citation chains are described that would reduce any result to its own inputs by construction. The method is therefore self-contained against external benchmarks rather than internally tautological.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Flow matching can be trained on mask-guided synthetic placement data to produce valid layouts
- domain assumption Hard constraint sampling during inference enforces zero overlaps without harming PPA metrics
Reference graph
Works this paper leans on
-
[1]
Anthony Agnesina, Puranjay Rajvanshi, Tian Yang, Geraldo Pradipta, Austin Jiao, Ben Keller, Brucek Khailany, and Haoxing Ren. 2023. Au- toDMP: Automated DREAMPlace-based macro placement. InProceed- ings of the 2023 International Symposium on Physical Design
work page 2023
-
[2]
Tutu Ajayi, Vidya A Chhabria, Mateus Fogaça, Soheil Hashemi, Abdel- rahman Hosny, Andrew B Kahng, Minsoo Kim, Jeongsup Lee, Uday Mallappa, Marina Neseem, et al. 2019. Toward an open-source digital flow: First learnings from the openroad project. InProceedings of 56th Design Automation Conference
work page 2019
-
[3]
Shaked Brody, Uri Alon, and Eran Yahav. 2022. How attentive are graph attention networks?. InProceedings of the 10th International Conference on Learning Representations
work page 2022
-
[4]
Yifan Chen, Zaiwen Wen, Yun Liang, and Yibo Lin. 2023. Stronger mixed-size placement backbone considering second-order information. InProceedings of the 42nd International Conference on Computer Aided Design
work page 2023
-
[5]
Maddix, Abdul Fatir Ansari, Andrew Stuart, Michael W
Chaoran Cheng, Boran Han, Danielle C. Maddix, Abdul Fatir Ansari, Andrew Stuart, Michael W. Mahoney, and Bernie Wang. 2025. Gradient- free generation for hard-constrained systems. InProceedings of the 13th International Conference on Learning Representations
work page 2025
-
[6]
Chung-Kuan Cheng, Andrew B Kahng, Ilgweon Kang, and Lutong Wang. 2018. Replace: Advancing solution quality and routability validation in global placement.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems38, 9 (2018), 1717–1730
work page 2018
-
[7]
Chung-Kuan Cheng, Andrew B Kahng, Sayak Kundu, Yucheng Wang, and Zhiang Wang. 2023. Assessment of reinforcement learning for macro placement. InProceedings of the 2023 International Symposium on Physical Design
work page 2023
-
[8]
Ruoyu Cheng and Junchi Yan. 2021. On joint learning for solving place- ment and routing in chip design. InAdvances in Neural Information Processing Systems 34
work page 2021
-
[9]
Zijie Geng, Jie Wang, Ziyan Liu, Siyuan Xu, Zhentao Tang, Mingxuan Yuan, Jianye Hao, Yongdong Zhang, and Feng Wu. 2024. Reinforcement learning within tree search for fast macro placement. InProceedings of the 41st International Conference on Machine Learning
work page 2024
- [10]
-
[11]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. InAdvances in Neural Information Processing Systems 33
work page 2020
-
[12]
Xianlong Hong, Gang Huang, Yici Cai, Jiangchun Gu, Sheqin Dong, Chung-Kuan Cheng, and Jun Gu. 2000. Corner block list: An effective and efficient topological representation of non-slicing floorplan. In Proceedings of the 13th International Conference on Computer Aided Design
work page 2000
-
[13]
Andrew B Kahng, Ravi Varadarajan, and Zhiang Wang. 2023. Hier- RTLMP: A hierarchical automatic macro placer for large-scale complex IP blocks.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems43, 5 (2023)
work page 2023
-
[14]
Myung-Chul Kim, Jin Hu, Jiajia Li, and Natarajan Viswanathan. 2015. ICCAD-2015 CAD contest in incremental timing-driven placement and benchmark suite. InProceedings of the 34th International Conference on Computer-Aided Design
work page 2015
-
[15]
Yao Lai, Jinxin Liu, Zhentao Tang, Bin Wang, Jianye Hao, and Ping Luo
-
[16]
InProceedings of the 40th International Conference on Machine Learning
ChiPFormer: Transferable chip placement via offline decision transformer. InProceedings of the 40th International Conference on Machine Learning
-
[17]
Yao Lai, Yao Mu, and Ping Luo. 2022. MaskPlace: Fast chip placement via reinforced visual representation learning. InAdvances in Neural Information Processing Systems 35
work page 2022
-
[18]
Vint Lee, Minh Nguyen, Leena Elzeiny, Chun Deng, Pieter Abbeel, and John Wawrzynek. 2025. Chip placement with diffusion models. In Proceedings of the 42nd International Conference on Machine Learning
work page 2025
-
[19]
Peiyu Liao, Dawei Guo, Zizheng Guo, Siting Liu, Yibo Lin, and Bei Yu
-
[20]
DREAMPlace 4.0: Timing-driven placement with momentum- based net weighting and lagrangian-based refinement.IEEE Transac- tions on Computer-Aided Design of Integrated Circuits and Systems42, 10 (2023), 3374–3387
work page 2023
-
[21]
Jai-Ming Lin, Szu-Ting Li, and Yi-Ting Wang. 2019. Routability-driven mixed-size placement prototyping approach considering design hier- archy and indirect connectivity between macros. InProceedings of the 56th Design Automation Conference
work page 2019
-
[22]
Yibo Lin, Zixuan Jiang, Jiaqi Gu, Wuxi Li, Shounak Dhar, Haoxing Ren, Brucek Khailany, and David Z Pan. 2020. DREAMPlace: Deep learning toolkit-enabled gpu acceleration for modern VLSI placement. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems40, 4 (2020), 748–761
work page 2020
-
[23]
Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. 2023. Flow matching for generative modeling. InProceed- ings of the 11th International Conference on Learning Representations
work page 2023
-
[24]
Xingchao Liu, Chengyue Gong, and Qiang Liu. 2022. Flow straight and fast: Learning to generate and transfer data with rectified flow. In Proceedings of the 10th International Conference on Learning Represen- tations
work page 2022
-
[25]
Azalia Mirhoseini, Anna Goldie, Mustafa Yazgan, Joe Wenjie Jiang, Ebrahim Songhori, Shen Wang, Young-Joon Lee, Eric Johnson, Omkar Pathak, Azade Nazi, et al. 2021. A graph placement methodology for fast chip design.Nature594, 7862 (2021), 207–212
work page 2021
-
[26]
Hiroshi Murata, Kunihiro Fujiyoshi, Shigetoshi Nakatake, and Yoji Kajitani. 1996. VLSI module placement based on rectangle-packing by the sequence-pair.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems15, 12 (1996), 1518–1524
work page 1996
-
[27]
John K Ousterhout. 1984. Corner stitching: A data-structuring tech- nique for VLSI layout tools.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems3, 1 (1984), 87–100
work page 1984
-
[28]
Yuan Pu, Tinghuan Chen, Zhuolun He, Chen Bai, Haisheng Zheng, Yibo Lin, and Bei Yu. 2024. IncreMacro: Incremental macro placement refinement. InProceedings of the 2024 International Symposium on Physical Design
work page 2024
-
[29]
Yunqi Shi, Xi Lin, Siyuan Xu, Shixiong Kai, Ke Xue, Mingxuan Yuan, Chao Qian, and Zhi-Hua Zhou. 2025. ReMaP: Macro placement by re- cursively prototyping and periphery-guided relocating. InProceedings of the 62nd Design Automation Conference
work page 2025
-
[30]
Yunqi Shi, Ke Xue, Lei Song, and Chao Qian. 2023. Macro placement by wire-mask-guided black-box optimization. InAdvances in Neural Information Processing Systems 36
work page 2023
-
[31]
Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, and Yoshua Bengio
-
[32]
Improving and generalizing flow-based generative models with minibatch optimal transport.Transactions on Machine Learning Re- search(2024), 1–34
work page 2024
-
[33]
Gomez, Łukasz Kaiser, and Illia Polosukhin
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. InAdvances in Neural Information Processing Systems 30
work page 2017
-
[34]
Meng-Chen Wu and Yao-Wen Chang. 2004. Placement with align- ment and performance constraints using the B*-tree representation. In Proceedings of the 2004 International Conference on Computer Design
work page 2004
-
[35]
Ke Xue, Ruo-Tong Chen, Xi Lin, Yunqi Shi, Shixiong Kai, Siyuan Xu, and Chao Qian. 2024. Reinforcement learning policy as macro regulator rather than macro placer. InAdvances in Neural Information Processing Systems 37
work page 2024
-
[36]
Jackey Z Yan, Natarajan Viswanathan, and Chris Chu. 2009. Han- dling complexities in modern large-scale mixed-size placement. In Proceedings of the 46th Design Automation Conference
work page 2009
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.