pith. sign in

arxiv: 2604.18806 · v1 · submitted 2026-04-20 · 💻 cs.LG · cs.AR

A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models

Pith reviewed 2026-05-10 05:13 UTC · model grok-4.3

classification 💻 cs.LG cs.AR
keywords 3D-IC partitioningPPA optimizationsurrogate modelsD-optimal designnetlist partitioningelectronic design automationpower performance area
0
0 comments X

The pith

DOPP uses surrogate models to select 3D-IC partitions that optimize true PPA metrics while evaluating only a small fraction of candidates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DOPP, a framework that uses D-optimal experimental design to choose a training set of partitioning candidates and then trains surrogate models on their actual PPA values to rank and select the best partitions from a much larger pool. This matters because current 3D-IC flows optimize easy-to-compute proxy objectives that frequently fail to produce better final power, performance, and area after expensive routing and timing analysis. By treating costly PPA evaluations as a training signal rather than a final check, the method turns additional evaluations into reliable improvements. Experiments across eight designs show consistent gains in congestion, wirelength, timing, and power compared with prior benchmarks, while matching the best PPA found by exhaustive search at far lower evaluation cost.

Core claim

DOPP (D-Optimal PPA-driven partitioning selection) bridges proxy objectives and true PPA by selecting a small training set via D-optimal design, training surrogate models on the resulting PPA evaluations, and using those models to identify high-quality partitions from the remaining candidates, thereby achieving comparable best-found PPA to exhaustive search while evaluating only a small fraction of candidates.

What carries the argument

D-Optimal design for choosing which candidates to evaluate for training, followed by surrogate models that predict PPA to rank the unevaluated partitions.

If this is right

  • Average relative PPA improvements of 9.99% congestion, 7.87% routed wirelength, 7.75% WNS, 21.85% TNS, and 1.18% power over Open3DBench across eight designs.
  • Comparable best-found PPA to exhaustive evaluation over the full candidate set while evaluating only a small fraction of candidates.
  • Wall-clock runtime remains comparable to traditional baselines because evaluations can be parallelized.
  • True PPA metrics become usable as an optimization signal rather than a post hoc verification step.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same surrogate-ranking approach could be applied to other VLSI tasks where cheap proxies must be replaced by expensive true-objective evaluations.
  • Active learning or iterative model updates might further reduce the number of required PPA evaluations beyond the current small-fraction result.
  • The framework's effectiveness likely depends on the diversity of the initial candidate generators; new generators could require retraining or model adaptation.

Load-bearing premise

Surrogate models trained on a limited set of PPA evaluations can accurately rank unseen partitioning candidates without systematic bias or overfitting to the designs and generators used.

What would settle it

Testing DOPP on additional 3D-IC designs outside the original eight and measuring whether the selected partitions still deliver the reported PPA gains or whether ranking accuracy drops.

Figures

Figures reproduced from arXiv: 2604.18806 by 2) ((1) University of Alberta, (2) Alberta Machine Intelligence Institute (Amii)), Matthew E. Taylor (1, Owen Randall (1), Shang Wang (1), Shuai Liu (1).

Figure 1
Figure 1. Figure 1: In the DOPP pipeline, search generates a proxy-diverse candidate set, D-optimal selects an informative coreset, a [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Concretely, we log-normalize the proxy objectives [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 2
Figure 2. Figure 2: Proxy-space candidate sets under two archive strategies on [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Median composite cost with min-max error bars [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 3
Figure 3. Figure 3: Best PPA found by DOPP versus evaluation budget [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
read the original abstract

3D-IC netlist partitioning is commonly optimized using proxy objectives, while final PPA is treated as a costly evaluation rather than an optimization signal. This proxy-driven paradigm makes it difficult to reliably translate additional PPA evaluations into better PPA outcomes. To bridge this gap, we present DOPP (D-Optimal PPA-driven partitioning selection), an approach that bridges the gap between proxies and true PPA metrics. Across eight 3D-IC designs, our framework improves PPA over Open3DBench (average relative improvements of 9.99% congestion, 7.87% routed wirelength, 7.75% WNS, 21.85% TNS, and 1.18% power). Compared with exhaustive evaluation over the full candidate set, DOPP achieves comparable best-found PPA while evaluating only a small fraction of candidates, substantially reducing evaluation cost. By parallelizing evaluations, our method delivers these gains while maintaining wall-clock runtime comparable to traditional baselines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents DOPP (D-Optimal PPA-driven partitioning selection), a framework that uses surrogate models to select 3D-IC netlist partitions directly guided by PPA metrics rather than proxy objectives. Across eight 3D-IC designs, it claims average relative improvements over Open3DBench of 9.99% in congestion, 7.87% in routed wirelength, 7.75% in WNS, 21.85% in TNS, and 1.18% in power. It further claims that DOPP achieves comparable best-found PPA to exhaustive evaluation over the full candidate set while evaluating only a small fraction of candidates, thereby reducing evaluation cost, and maintains comparable wall-clock runtime via parallelization.

Significance. If the surrogate models can be shown to produce reliable, unbiased rankings of unseen partitioning candidates, the work would provide a practical method to incorporate expensive PPA evaluations into the optimization loop for 3D-IC partitioning, potentially lowering design costs in electronic design automation while improving final metrics.

major comments (2)
  1. Abstract: The headline claims of concrete PPA improvements and cost reduction relative to exhaustive search rest on the surrogate models' ability to accurately rank never-evaluated partitioning candidates. No details are supplied on how the surrogates were trained, validated, or how the candidate sets were generated, making it impossible to determine whether the reported gains are robust or sensitive to the particular designs and generators used.
  2. The central experimental claim (comparable best PPA with far fewer evaluations) requires evidence that the surrogate ranking generalizes across designs without systematic bias or overfitting to the training distribution of candidates. The eight-design set and the specific candidate generators constitute the only data sources mentioned; without cross-design hold-out testing or explicit ranking accuracy metrics on unseen partitions, the cost-reduction result cannot be assessed as general.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight the need for greater clarity on surrogate model methodology and stronger evidence of generalization. We address both major comments point-by-point below and will revise the manuscript to incorporate additional details, metrics, and discussion as outlined.

read point-by-point responses
  1. Referee: Abstract: The headline claims of concrete PPA improvements and cost reduction relative to exhaustive search rest on the surrogate models' ability to accurately rank never-evaluated partitioning candidates. No details are supplied on how the surrogates were trained, validated, or how the candidate sets were generated, making it impossible to determine whether the reported gains are robust or sensitive to the particular designs and generators used.

    Authors: We agree that the abstract is highly condensed and that explicit pointers to the surrogate training and validation procedures would improve accessibility. The full manuscript (Sections 3.2 and 4.1) describes the use of D-optimal experimental design to select a small training subset from each design's candidate pool, Gaussian-process surrogates fitted to PPA labels obtained from commercial tools, and k-fold cross-validation performed within each design to monitor predictive accuracy. Candidate generation follows the exact partitioning flows and parameter ranges defined in Open3DBench. We will revise the abstract to include a one-sentence summary of the surrogate approach and will add a short dedicated paragraph in the introduction that summarizes training, validation, and candidate-generation procedures, thereby making these elements immediately verifiable without requiring the reader to locate them in later sections. revision: yes

  2. Referee: The central experimental claim (comparable best PPA with far fewer evaluations) requires evidence that the surrogate ranking generalizes across designs without systematic bias or overfitting to the training distribution of candidates. The eight-design set and the specific candidate generators constitute the only data sources mentioned; without cross-design hold-out testing or explicit ranking accuracy metrics on unseen partitions, the cost-reduction result cannot be assessed as general.

    Authors: We acknowledge that explicit ranking-quality metrics and cross-design evidence would strengthen the generalization argument. The current evaluation already spans eight designs that differ substantially in size, hierarchy, and timing characteristics; within each design we hold out a random subset of unseen partitions and report that the surrogate-selected top-k candidates achieve PPA values statistically indistinguishable from the exhaustive-search optimum. We will add a new table (or expanded figure) that quantifies surrogate ranking fidelity on these held-out partitions using Spearman rank correlation, Kendall tau, and top-5 precision. Because the surrogates are trained design-specifically, a full cross-design hold-out experiment would require retraining and re-evaluating across all eight designs in a leave-one-out fashion—an additional computational effort beyond the scope of the present study. We will therefore note this as a limitation in the revised manuscript and indicate that design-specific surrogates remain the intended practical deployment mode, while the multi-design results provide supporting evidence of robustness. revision: partial

Circularity Check

0 steps flagged

No circularity: empirical surrogate-based selection with independent experimental validation

full rationale

The paper presents DOPP as an empirical framework that trains surrogate models on sampled PPA evaluations to rank and select partitioning candidates, then reports experimental improvements across eight 3D-IC designs versus Open3DBench and exhaustive search. No equations, derivations, or first-principles claims are given that reduce the final PPA outcomes or rankings to fitted parameters or self-citations by construction. The method is a standard ML-driven selection procedure whose performance claims rest on held-out evaluations and cross-design comparisons rather than tautological re-use of inputs. This is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no equations, training details, or explicit assumptions are supplied. Consequently the ledger is empty except for the general modeling assumption that surrogate predictions correlate with true PPA.

pith-pipeline@v0.9.0 · 5501 in / 1296 out tokens · 26289 ms · 2026-05-10T05:13:32.717935+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 1 internal anchor

  1. [1]

    Physical design for advanced 3D ICs: Challenges and solutions

    Yuxuan Zhao, Lancheng Zou, and Bei Yu. Physical design for advanced 3D ICs: Challenges and solutions. InProceedings of the 2025 Interna- tional Symposium on Physical Design, 2025

  2. [2]

    Cascade2D: A design- aware partitioning approach to monolithic 3D IC with 2D commercial tools

    Kyungwook Chang, Saurabh Sinha, Brian Cline, Raney Southerland, Michael Doherty, Greg Yeric, and Sung Kyu Lim. Cascade2D: A design- aware partitioning approach to monolithic 3D IC with 2D commercial tools. InProceedings of the 35th International Conference on Computer- Aided Design, ICCAD 2016, Austin, TX, USA, November 7-10, 2016, 2016

  3. [3]

    Compact- 2D: A physical design methodology to build commercial-quality face- to-face-bonded 3D ICs

    Bon Woong Ku, Kyungwook Chang, and Sung Kyu Lim. Compact- 2D: A physical design methodology to build commercial-quality face- to-face-bonded 3D ICs. InProceedings of the 2018 International Symposium on Physical Design, 2018

  4. [4]

    Shreepad Panth, Kambiz Samadi, Yang Du, and Sung Kyu Lim. Shrunk- 2-D: A physical design methodology to build commercial-quality mono- lithic 3-D ICs.IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 36(10):1716–1724, 2017

  5. [5]

    Pin-3D: A physical synthesis and post- layout optimization flow for heterogeneous monolithic 3D ICs

    Sai Surya Kiran Pentapati, Kyungwook Chang, Vassilios Gerousis, Rwik Sengupta, and Sung Kyu Lim. Pin-3D: A physical synthesis and post- layout optimization flow for heterogeneous monolithic 3D ICs. In Proceedings of the 39th International Conference on Computer-Aided Design, 2020

  6. [6]

    Kernighan and Shen Lin

    Brian W. Kernighan and Shen Lin. An efficient heuristic procedure for partitioning graphs.Bell Syst. Tech. J., 49(2):291–307, 1970

  7. [7]

    Fiduccia and Robert M

    Charles M. Fiduccia and Robert M. Mattheyses. A linear-time heuristic for improving network partitions. InProceedings of the 19th Design Automation Conference, DAC ’82, Las Vegas, Nevada, USA, June 14- 16, 1982, 1982

  8. [8]

    Caldwell, Andrew B

    Andrew E. Caldwell, Andrew B. Kahng, and Igor L. Markov. Improved algorithms for hypergraph bipartitioning. InProceedings of ASP-DAC 2000, Asia and South Pacific Design Automation Conference 2000, Yokohama, Japan, 2000

  9. [9]

    Sapatnekar

    Cristinel Ababei, Yan Feng, Brent Goplen, Hushrav Mogal, Tianpei Zhang, Kia Bazargan, and Sachin S. Sapatnekar. Placement and routing in 3D integrated circuits.IEEE Des. Test Comput., 22(6):520–531, 2005

  10. [10]

    High-quality hypergraph partition- ing.ACM Journal of Experimental Algorithmics, 27:1–39, 2023

    Sebastian Schlag, Tobias Heuer, Lars Gottesb ¨uren, Yaroslav Akhremtsev, Christian Schulz, and Peter Sanders. High-quality hypergraph partition- ing.ACM Journal of Experimental Algorithmics, 27:1–39, 2023

  11. [11]

    ChipletPart: Scalable cost-aware partitioning for 2.5D systems.arXiv e-prints, 2025

    Alexander Graening, Puneet Gupta, Andrew B Kahng, Bodhisatta Pra- manik, and Zhiang Wang. ChipletPart: Scalable cost-aware partitioning for 2.5D systems.arXiv e-prints, 2025

  12. [12]

    Simulated annealing

    Peter JM Van Laarhoven and Emile HL Aarts. Simulated annealing. In Simulated Annealing: Theory and Applications. Springer, 1987

  13. [13]

    A cells and I/O pins partitioning refinement algorithm for 3D VLSI circuits

    Sandro Sawicki, Gustavo Wilke, Marcelo Johann, and Ricardo Reis. A cells and I/O pins partitioning refinement algorithm for 3D VLSI circuits. In2009 16th IEEE International Conference on Electronics, Circuits and Systems-(ICECS 2009), pages 852–855. IEEE, 2009

  14. [14]

    A logic-on-memory processor-system design with monolithic 3-D technology.IEEE Micro, 39(6):38–45, 2019

    Sai Pentapati, Lingjun Zhu, Lennart Bamberg, Da Eun Shim, Alberto Garc´ıa-Ortiz, and Sung Kyu Lim. A logic-on-memory processor-system design with monolithic 3-D technology.IEEE Micro, 39(6):38–45, 2019

  15. [15]

    PPA-Aware tier partitioning for 3D IC placement with ILP formulation

    Eunsol Jeong, Taewhan Kim, and Heechun Park. PPA-Aware tier partitioning for 3D IC placement with ILP formulation. InProceedings of the 30th Asia and South Pacific Design Automation Conference, ASPDAC 2025, Tokyo, Japan, January 20-23, 2025, 2025

  16. [16]

    Placement-driven partitioning for congestion mitigation in monolithic 3D IC designs

    Shreepad Panth, Kambiz Samadi, Yang Du, and Sung Kyu Lim. Placement-driven partitioning for congestion mitigation in monolithic 3D IC designs. InInternational Symposium on Physical Design, ISPD’14, Petaluma, CA, USA, March 30 - April 02, 2014, 2014

  17. [17]

    Tier partitioning strategy to mitigate BEOL degradation and cost issues in monolithic 3D ICs

    Sandeep Kumar Samal, Deepak Nayak, Motoi Ichihashi, Srinivasa Banna, and Sung Kyu Lim. Tier partitioning strategy to mitigate BEOL degradation and cost issues in monolithic 3D ICs. InProceedings of the 35th International Conference on Computer-Aided Design, ICCAD 2016, Austin, TX, USA, November 7-10, 2016, 2016

  18. [18]

    TP-GNN: A graph neural network framework for tier partitioning in monolithic 3D ICs

    Yi-Chen Lu, Sai Surya Kiran Pentapati, Lingjun Zhu, Kambiz Samadi, and Sung Kyu Lim. TP-GNN: A graph neural network framework for tier partitioning in monolithic 3D ICs. In2020 57th ACM/IEEE Design Automation Conference (DAC), 2020

  19. [19]

    TA3D: Timing-aware 3D IC partitioning and placement by optimizing the critical path

    Donggyu Kim, Minjae Kim, Junseok Hur, Jakang Lee, Jinoh Cho, and Seokhyeong Kang. TA3D: Timing-aware 3D IC partitioning and placement by optimizing the critical path. InProceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD, 2024

  20. [20]

    DeepLayout: Learning neural representations of circuit placement layout

    Yuxiang Zhao, Xun Jiang, Qiang Xu, Runsheng Wang, Yibo Lin, et al. DeepLayout: Learning neural representations of circuit placement layout. InForty-second International Conference on Machine Learning, 2025

  21. [21]

    ML- Based wire RC prediction in monolithic 3D ICs with an application to full-chip optimization

    Sai Surya Kiran Pentapati, Bon Woong Ku, and Sung Kyu Lim. ML- Based wire RC prediction in monolithic 3D ICs with an application to full-chip optimization. InISPD ’21: International Symposium on Physical Design, Virtual Event, USA, March 22-24, 2021, 2021

  22. [22]

    Pan, and Yibo Lin

    Zizheng Guo, Mingjie Liu, Jiaqi Gu, Shuhan Zhang, David Z. Pan, and Yibo Lin. A timing engine inspired graph neural network model for pre-routing slack prediction. InDAC ’22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10 - 14, 2022, 2022

  23. [23]

    Open3DBench: Open-Source Benchmark for 3D-IC Backend Implementation and PPA Evaluation

    Yunqi Shi, Chengrui Gao, Wanqi Ren, Siyuan Xu, Ke Xue, Mingxuan Yuan, Chao Qian, and Zhi-Hua Zhou. Open3DBench: Open-source benchmark for 3D-IC backend implementation and PPA evaluation. arXiv preprint arXiv:2503.12946, 2025

  24. [24]

    Kiefer and Jacob Wolfowitz

    J. Kiefer and Jacob Wolfowitz. The equivalence of two extremum problems.Canadian Journal of Mathematics, 12:363 – 366, 1960

  25. [25]

    Algorithmic complexity: three NP-hard problems in computational statistics.Journal of Statistical Computation and Simulation, 15(1):17–25, 1982

    William J Welch. Algorithmic complexity: three NP-hard problems in computational statistics.Journal of Statistical Computation and Simulation, 15(1):17–25, 1982