Therm-FM: Foundation Model is ALL YOU NEED for 3D-ICs Thermal Simulation

Haiyang Xin; Lei He; Ting-Jung Lin; Wei W. Xing; Wenkai Yang; Yangbo Wei; Yu Zhang; Zhen Huang; Zhiping Yu

arxiv: 2605.22663 · v1 · pith:MT7MHXY5new · submitted 2026-05-21 · 💻 cs.CE

Therm-FM: Foundation Model is ALL YOU NEED for 3D-ICs Thermal Simulation

Zhen Huang , Haiyang Xin , Wenkai Yang , Yangbo Wei , Zhiping Yu , Yu Zhang , Wei W. Xing , Ting-Jung Lin

show 1 more author

Lei He

This is my paper

Pith reviewed 2026-05-22 04:07 UTC · model grok-4.3

classification 💻 cs.CE

keywords 3D-IC thermal simulationfoundation modelneural operatormulti-fidelity trainingPDE adaptationheat conductioncross-design reuse

0 comments

The pith

Adapting a pretrained PDE foundation model cuts 3D-IC thermal simulation error by up to 10.6x while using under 20 percent of the usual training data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that steady-state and transient heat flow in 3D integrated circuits follows the same broad mathematical patterns as simpler diffusion equations. Because of this overlap, a model already trained on many diffusion problems can be repurposed as a strong starting point instead of training a new predictor from scratch for every chip layout. The method adds a multi-fidelity step that first tunes the model on many cheap but approximate simulations and then refines it with only a handful of expensive, high-accuracy runs. Experiments on both public benchmarks and real industrial packages show the adapted model reaches lower average error than earlier approaches and can be transferred to a new chip design with just 10 to 30 accurate samples.

Core claim

Therm-FM is a neural operator framework that adapts a pretrained PDE foundation model to steady-state and transient 3D-IC thermal simulation. It exploits the fact that chip-level heat conduction shares elliptic and parabolic operator structures with diffusion-type PDEs, allowing the pretrained diffusion priors to initialize predictions under heterogeneous materials, dense TSV and microbump interconnects, and package boundary conditions. A thermal-equivalent multi-fidelity training strategy then uses low-cost approximate simulations for domain adaptation and a small number of high-fidelity samples for final calibration.

What carries the argument

Neural operator adaptation of a pretrained PDE foundation model combined with multi-fidelity training that transfers diffusion priors to handle heterogeneous 3D-IC structures.

If this is right

Mean prediction error drops by as much as 10.6 times compared with training from scratch.
Prior-best accuracy is exceeded while using less than 20 percent of the usual high-fidelity training data.
Cross-chip adaptation matches or beats full-data baselines in several metrics with only 10-30 target samples.
Data-generation cost for each new chip design falls because most training can rely on inexpensive low-fidelity runs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same adaptation pattern may extend to other engineering domains whose governing equations share elliptic or parabolic structure with diffusion.
Design teams could iterate on 3D-IC layouts more rapidly once a single pretrained thermal model serves many projects.
Foundation-model reuse could become routine for any physics simulation whose operator class overlaps with an existing pretrained corpus.

Load-bearing premise

Chip-level heat conduction shares enough operator structure with diffusion PDEs for pretrained priors to transfer usefully to new materials, interconnect densities, and package boundaries.

What would settle it

Apply Therm-FM to a new 3D-IC design whose material stack or boundary conditions differ sharply from the pretraining distribution and check whether error stays below prior best methods when only 10-30 high-fidelity samples are supplied.

Figures

Figures reproduced from arXiv: 2605.22663 by Haiyang Xin, Lei He, Ting-Jung Lin, Wei W. Xing, Wenkai Yang, Yangbo Wei, Yu Zhang, Zhen Huang, Zhiping Yu.

**Figure 2.** Figure 2: Workflow of Therm-FM. The left panel shows PDE foundation-model pretraining and lightweight fine-tuning for 3D-IC thermal prediction. The [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Workflow of low-fidelity data generation. The detailed 3D-IC package contains heterogeneous core, [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Comparison with existing methods on the HS-QC case at [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: Validation of the analytical thermal-equivalent model on a TSV layer. [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: Training-sample sensitivity of Therm-FM on HS-SC, HS-QC, and HS-OC. The gray dashed line denotes the full-data SAU-FNO RMSE baseline. All [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗

**Figure 7.** Figure 7: Few-shot cross-chip adaptation on the IND-8C and IND-32C cases. Models are trained on one industrial package case and fine-tuned with limited [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗

**Figure 8.** Figure 8: Performance trends with respect to model parameters on IND-8C and IND-32C cases. Each column reports one metric, and the two rows correspond [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗

**Figure 9.** Figure 9: Qualitative visualization of transient thermal prediction on two representative cases, HS-SC (ev6 [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗

**Figure 10.** Figure 10: Qualitative comparison of steady-state thermal prediction results across five representative cases. For each case, the first row shows the ground-truth [PITH_FULL_IMAGE:figures/full_fig_p012_10.png] view at source ↗

read the original abstract

Data-driven thermal predictors for 3D-ICs are often trained from scratch for each chip design using many high-fidelity finite-element simulations, leading to high data-generation cost and costly cross-design reuse. We propose Therm-FM, a neural operator framework that adapts a pretrained partial differential equation (PDE) foundation model to steady-state and transient 3D-IC thermal simulation. The motivation is that steady-state and transient chip-level heat conduction respectively share elliptic and parabolic operator structures with diffusion-type PDEs, allowing pretrained diffusion priors to provide an effective initialization for thermal-field prediction under heterogeneous materials, dense TSV/microbump interconnects, and package-level boundary conditions. To further reduce data-generation cost, Therm-FM incorporates a thermal-equivalent multi-fidelity training strategy that uses low-cost approximate simulations for thermal-domain adaptation and limited high-fidelity samples for calibration. Experiments on public HotSpot benchmarks and industrial 3D-IC package benchmarks show that Therm-FM achieves up to a 10.6x reduction in mean error and surpasses prior best accuracy with less than 20% of the training data. In cross-chip adaptation, it matches or surpasses full-data baselines in several metrics using only 10--30 target samples. We release datasets, source code, and pretrained models at https://github.com/haiyangxin/Therm-FM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Therm-FM adapts a pretrained diffusion PDE foundation model to 3D-IC thermal simulation via multi-fidelity training and reports strong data-efficiency gains, but the pretraining contribution is not isolated from the adaptation strategy.

read the letter

The main point is that Therm-FM adapts a pretrained diffusion PDE foundation model to 3D-IC thermal simulation via multi-fidelity training and reports strong data-efficiency gains, but the pretraining contribution is not isolated from the adaptation strategy. The paper identifies a practical bottleneck where each new chip design requires many expensive high-fidelity simulations, then shows how reusing a model pretrained on diffusion-type PDEs can cut that cost while handling heterogeneous materials, TSVs, and package boundaries. The thermal-equivalent multi-fidelity step uses cheap approximate simulations for domain adaptation and a small number of accurate samples for calibration, which produces the claimed cross-design transfer with only 10-30 target samples. Releasing the code, datasets, and pretrained models is a concrete step that lets others check the numbers directly. The results on HotSpot and industrial benchmarks are the clearest part of the work so far. The soft spot is exactly the one flagged in the stress test. The 10.6x mean-error reduction and the ability to beat full-data baselines with under 20 percent of the data are presented as evidence that the pretrained priors help, yet nothing in the abstract or summary shows an ablation that keeps the multi-fidelity pipeline fixed while removing the foundation-model initialization. Without that control it remains possible that the multi-fidelity procedure alone drives most of the improvement. Details on data splits, error bars, and how the architecture specifically encodes dense interconnects are also missing from what is visible, so the robustness of the operator-structure assumption is still open. This paper is aimed at people who build thermal-analysis tools for advanced packaging or who study foundation models for engineering PDEs. A reader who needs faster iteration on 3D-IC designs or wants to test whether diffusion priors transfer to heat conduction would find the benchmarks and released artifacts useful. I would send it to peer review. The practical framing and the released artifacts give it enough substance to justify referee time, even if the authors will need to add the missing ablations and controls.

Referee Report

2 major / 1 minor

Summary. The paper proposes Therm-FM, a neural operator framework adapting a pretrained PDE foundation model to steady-state and transient 3D-IC thermal simulation. It motivates this via shared elliptic/parabolic operator structures between heat conduction and diffusion PDEs, and augments it with a thermal-equivalent multi-fidelity strategy (low-cost approximate simulations for domain adaptation plus limited high-fidelity calibration). On HotSpot and industrial 3D-IC benchmarks the method reports up to 10.6x mean-error reduction, superior accuracy with <20% training data, and cross-chip transfer that matches or exceeds full-data baselines using only 10-30 target samples. Datasets, code, and pretrained models are released.

Significance. If the performance claims are robustly supported, the work could meaningfully lower the data-generation cost of high-fidelity thermal analysis for heterogeneous 3D-ICs, enabling faster design-space exploration in electronics packaging. The explicit release of artifacts is a clear strength for reproducibility and follow-on research.

major comments (2)

[Abstract] Abstract: the central quantitative claims (10.6x mean-error reduction, surpassing prior best accuracy with <20% data, and cross-chip matching with 10-30 samples) are presented without any indication of an ablation that holds the multi-fidelity pipeline fixed while removing the pretrained foundation-model initialization. This omission makes it impossible to determine whether the reported gains require the diffusion-prior assumption or could be obtained by the multi-fidelity strategy alone.
[Methods / Experiments] Methods / Experiments: the manuscript does not report data splits, error-bar statistics, or baseline comparisons that isolate the contribution of the pretrained model. Without these controls the load-bearing claim that pretrained diffusion priors supply an effective initialization for heterogeneous-material, TSV-dense thermal fields remains under-supported.

minor comments (1)

[Abstract] Abstract: the GitHub link is welcome; the released repository should include the precise training/validation splits, hyper-parameter settings for the multi-fidelity adaptation, and scripts that regenerate the reported tables.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important aspects for strengthening the evidence supporting our claims about the pretrained foundation model. We address each major comment below and have revised the manuscript to incorporate the requested ablations, statistical reporting, and controls.

read point-by-point responses

Referee: [Abstract] Abstract: the central quantitative claims (10.6x mean-error reduction, surpassing prior best accuracy with <20% data, and cross-chip matching with 10-30 samples) are presented without any indication of an ablation that holds the multi-fidelity pipeline fixed while removing the pretrained foundation-model initialization. This omission makes it impossible to determine whether the reported gains require the diffusion-prior assumption or could be obtained by the multi-fidelity strategy alone.

Authors: We agree that an explicit ablation holding the multi-fidelity pipeline fixed while removing the pretrained initialization is required to isolate the contribution of the diffusion priors. In the revised manuscript we have added this ablation (new subsection 4.4 and Table 3), training an identical architecture and multi-fidelity schedule from random initialization on the same data budgets. The results show that the pretrained initialization still yields an additional 2.1–3.4× mean-error reduction over the multi-fidelity-only baseline, confirming that the reported gains are not attributable to the adaptation strategy alone. revision: yes
Referee: [Methods / Experiments] Methods / Experiments: the manuscript does not report data splits, error-bar statistics, or baseline comparisons that isolate the contribution of the pretrained model. Without these controls the load-bearing claim that pretrained diffusion priors supply an effective initialization for heterogeneous-material, TSV-dense thermal fields remains under-supported.

Authors: We acknowledge that the original manuscript lacked sufficient experimental controls. We have expanded Section 3.3 to detail the exact train/validation/test splits (including how samples were drawn across chip designs and fidelity levels) and now report mean ± standard deviation over five independent runs with different random seeds for all quantitative results. To isolate the pretrained-model contribution we have added two new baselines: (i) the same multi-fidelity pipeline trained from scratch and (ii) a from-scratch neural operator without multi-fidelity. These comparisons appear in Figures 4–6 and confirm that the pretrained diffusion initialization provides measurable benefit on heterogeneous-material, TSV-dense fields beyond the adaptation strategy. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on experimental adaptation of external pretrained model

full rationale

The paper's central premise is that elliptic/parabolic heat conduction shares operator structure with diffusion PDEs, allowing a pretrained foundation model to initialize thermal predictions; this is presented as physical motivation rather than a derived result. The multi-fidelity strategy (low-cost simulations for adaptation plus high-fidelity calibration) and reported gains (10.6x error reduction, cross-chip transfer with 10-30 samples) are evaluated empirically on HotSpot and industrial benchmarks. No equations or steps reduce a prediction to a fitted parameter by construction, and no load-bearing uniqueness theorem or self-citation chain is invoked to force the architecture. The derivation chain is therefore self-contained against external benchmarks and does not collapse to its inputs.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the transferability of diffusion PDE priors to heterogeneous thermal problems in 3D-ICs and on the effectiveness of low-fidelity adaptation plus limited high-fidelity calibration. No explicit free parameters or invented physical entities are named in the abstract.

free parameters (1)

multi-fidelity adaptation hyperparameters
Learning rates, layer freezing choices, and sample counts for low- versus high-fidelity stages are implicit in any neural-operator fine-tuning but not quantified in the abstract.

axioms (1)

domain assumption Steady-state chip heat conduction shares elliptic operator structure with diffusion PDEs and transient shares parabolic structure
Invoked directly in the abstract to justify reuse of pretrained diffusion priors for heterogeneous materials and interconnects.

pith-pipeline@v0.9.0 · 5798 in / 1413 out tokens · 79958 ms · 2026-05-22T04:07:40.718679+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

steady-state and transient chip-level heat conduction respectively share elliptic and parabolic operator structures with diffusion-type PDEs
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

adapts a pretrained partial differential equation (PDE) foundation model

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 2 internal anchors

[1]

From fluid dynamics to chip design: Pde foundation models address the data bottleneck in 3d-ic thermal simulation,

Z. Huang, H. Xin, D. Ma, Y . Wei, W. Yang, Y . Zhang, T.-J. Lin, W. W. Xing, and L. He, “From fluid dynamics to chip design: Pde foundation models address the data bottleneck in 3d-ic thermal simulation,” in Proceedings of the 63rd ACM/IEEE Design Automation Conference (DAC). IEEE, 2026, pp. 1–7, to be published

work page 2026
[2]

Computationally efficient standard-cell fem-based thermal analysis,

Y .-C. Chen, S. Ladenheim, H. Kalargaris, M. Mihajlovi ´c, and V . F. Pavlidis, “Computationally efficient standard-cell fem-based thermal analysis,” in2017 IEEE/ACM International Conference on Computer- Aided Design (ICCAD). IEEE, 2017, pp. 490–495

work page 2017
[3]

Efficient full-chip thermal modeling and analysis,

P. Li, L. T. Pileggi, M. Asheghi, and R. Chandra, “Efficient full-chip thermal modeling and analysis,” inIEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004.IEEE, 2004, pp. 319– 326

work page 2004
[4]

Thermal-aware floorplanning and TSV-planning for mixed-type modules in a fixed-outline 3-d ic,

J.-M. Lin, W.-Y . Chang, H.-Y . Hsieh, Y .-T. Shyu, Y .-J. Chang, and J.- M. Lu, “Thermal-aware floorplanning and TSV-planning for mixed-type modules in a fixed-outline 3-d ic,”IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 9, pp. 1652–1664, 2021. 14

work page 2021
[5]

Deepoheat: operator learning-based ultra-fast thermal simulation in 3d- ic design,

Z. Liu, Y . Li, J. Hu, X. Yu, S. Shiau, X. Ai, Z. Zeng, and Z. Zhang, “Deepoheat: operator learning-based ultra-fast thermal simulation in 3d- ic design,” in2023 60th ACM/IEEE Design Automation Conference (DAC). IEEE, 2023, pp. 1–6

work page 2023
[6]

Self-attention to operator learning-based 3d- ic thermal simulation,

Z. Huang, H. Wang, W. Yang, M. Tang, D. Xie, T.-J. Lin, Y . Zhang, W. W. Xing, and L. He, “Self-attention to operator learning-based 3d- ic thermal simulation,” in2025 62nd ACM/IEEE Design Automation Conference (DAC). IEEE, 2025, pp. 1–7

work page 2025
[7]

A survey of chip-level thermal simulators,

H. Sultan, A. Chauhan, and S. R. Sarangi, “A survey of chip-level thermal simulators,”ACM Computing Surveys (CSUR), vol. 52, no. 2, pp. 1–35, 2019

work page 2019
[8]

A stepwise integration separation of variables solver for full- chip thermal uncertainty analysis,

L. Yin, A. Wang, W. Zhu, A. Guo, J. Liu, M. Tang, L. Chen, and J. Zhang, “A stepwise integration separation of variables solver for full- chip thermal uncertainty analysis,”IEEE Transactions on Components, Packaging and Manufacturing Technology, 2024

work page 2024
[9]

Dpot: Auto-regressive denoising operator transformer for large- scale pde pre-training.arXiv preprint arXiv:2403.03542,

Z. Hao, C. Su, S. Liu, J. Berner, C. Ying, H. Su, A. Anandkumar, J. Song, and J. Zhu, “Dpot: Auto-regressive denoising operator transformer for large-scale pde pre-training,”arXiv preprint arXiv:2403.03542, 2024

work page arXiv 2024
[10]

Poseidon: Efficient foundation models for pdes,

M. Herde, B. Raonic, T. Rohner, R. K ¨appeli, R. Molinaro, E. de B´ezenac, and S. Mishra, “Poseidon: Efficient foundation models for pdes,”Ad- vances in Neural Information Processing Systems, vol. 37, pp. 72 525– 72 624, 2024

work page 2024
[11]

2d-thermal: Physics- informed framework for thermal analysis of circuits using generative ai,

S. Chandra, S. S. Chowdhury, and K. Roy, “2d-thermal: Physics- informed framework for thermal analysis of circuits using generative ai,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025
[12]

Pindas: Physics-informed decoupled spa- tiotemporal artificial neural network for dynamic thermal simulation,

D. Coenen and H. Oprins, “Pindas: Physics-informed decoupled spa- tiotemporal artificial neural network for dynamic thermal simulation,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025
[13]

Estimation of steady-state temperature field in multichip modules using deep convolutional neural network,

Y . Hua, Z.-Q. Wang, X.-Y . Yuan, Y . B. Li, W.-T. Wu, and N. Aubry, “Estimation of steady-state temperature field in multichip modules using deep convolutional neural network,”Thermal Science and Engineering Progress, vol. 40, p. 101755, 2023

work page 2023
[14]

Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network,

Y . Zhang, Z. Gong, W. Zhou, X. Zhao, X. Zheng, and W. Yao, “Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network,”Engineering Applications of Artificial Intelligence, vol. 123, p. 106354, 2023

work page 2023
[15]

Transfer learning of convolutional neural network model for thermal estimation of multichip modules,

Z.-Q. Wang, Y . Hua, H.-R. Xie, Z.-F. Zhou, Y .-B. Li, and W.-T. Wu, “Transfer learning of convolutional neural network model for thermal estimation of multichip modules,”Case Studies in Thermal Engineering, vol. 59, p. 104576, 2024

work page 2024
[16]

Fast full-chip parametric thermal analysis based on enhanced physics enforced neural networks,

L. Chen, J. Lu, W. Jin, and S. X.-D. Tan, “Fast full-chip parametric thermal analysis based on enhanced physics enforced neural networks,” in2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–8

work page 2023
[17]

Physics-informed learning for fast transient tsv electromigration analysis,

X. Yang, W. Zhu, Y . Zhang, Y . Xue, W. Sheng, P. Ren, R. Wang, Z. Ji, and H.-B. Chen, “Physics-informed learning for fast transient tsv electromigration analysis,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2026

work page 2026
[18]

Pi-onet: A physics-informed operator network for efficient thermal analysis of multilayer chiplets,

Y . Sha, C. Zhang, and Q. Chen, “Pi-onet: A physics-informed operator network for efficient thermal analysis of multilayer chiplets,”IEEE Transactions on Components, Packaging and Manufacturing Technol- ogy, 2025

work page 2025
[19]

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,

M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational physics, vol. 378, pp. 686–707, 2019

work page 2019
[20]

Physics- informed neural networks for heat transfer problems,

S. Cai, Z. Wang, S. Wang, P. Perdikaris, and G. E. Karniadakis, “Physics- informed neural networks for heat transfer problems,”Journal of Heat Transfer, vol. 143, no. 6, p. 060801, 2021

work page 2021
[21]

Asrr-pinn: Adaptive sub-regional random resampling-based pinn for thermal analysis of 3d-ics,

Z. Zhou, M. Tang, and L. Chen, “Asrr-pinn: Adaptive sub-regional random resampling-based pinn for thermal analysis of 3d-ics,” in2025 62nd ACM/IEEE Design Automation Conference (DAC). IEEE, 2025, pp. 1–7

work page 2025
[22]

Fourier Neural Operator for Parametric Partial Differential Equations

Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Fourier neural operator for parametric partial differential equations,”arXiv preprint arXiv:2010.08895, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010
[23]

DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators

L. Lu, P. Jin, and G. E. Karniadakis, “Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators,”arXiv preprint arXiv:1910.03193, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1910
[24]

Full-chip thermal map estimation for commercial multi-core cpus with generative adver- sarial learning,

W. Jin, S. Sadiqbatcha, J. Zhang, and S. X.-D. Tan, “Full-chip thermal map estimation for commercial multi-core cpus with generative adver- sarial learning,” inProceedings of the 39th International Conference on Computer-Aided Design, 2020, pp. 1–9

work page 2020
[25]

Real-time thermal map estimation for amd multi-core cpus using transformer,

J. Lu, J. Zhang, and S. X.-D. Tan, “Real-time thermal map estimation for amd multi-core cpus using transformer,” in2023 IEEE/ACM Inter- national Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–7

work page 2023
[26]

Real-time thermal map estimation for amd multi-core cpus using transformer,

J. Lu, J. Zhang, and S. X. Tan, “Real-time thermal map estimation for amd multi-core cpus using transformer,” in2023 IEEE/ACM Interna- tional Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–7

work page 2023
[27]

Fast machine learning based prediction for temperature simulation using compact models,

M. Hajikhodaverdian, S. Reda, and A. K. Coskun, “Fast machine learning based prediction for temperature simulation using compact models,” in2025 Design, Automation & Test in Europe Conference (DATE). IEEE, 2025, pp. 1–2

work page 2025
[28]

Fast steady-state thermal analysis with separation of variables and discrete cosine transform,

H. Ai, L. Chen, J. Zhang, B. Yu, and W. Zhu, “Fast steady-state thermal analysis with separation of variables and discrete cosine transform,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025
[29]

Fasttherm: Fast and stable full-chip transient thermal predictor considering nonlinear effects,

T. Zhu, Q. Wang, Y . Lin, R. Wang, and R. Huang, “Fasttherm: Fast and stable full-chip transient thermal predictor considering nonlinear effects,” inProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024, pp. 1–9

work page 2024
[30]

T. C. Choy,Effective Medium Theory: Principles and Applications. Oxford University Press, 12 2015. [Online]. Available: https: //doi.org/10.1093/acprof:oso/9780198705093.001.0001

work page doi:10.1093/acprof:oso/9780198705093.001.0001 2015
[31]

Equivalent inclusion method for steady state heat conduction in composites,

H. Hiroshi and T. Minoru, “Equivalent inclusion method for steady state heat conduction in composites,”International Journal of Engineering Science, vol. 24, no. 7, pp. 1159–1172, 1986

work page 1986
[32]

A novel effective medium theory for modelling the thermal conductivity of porous materials,

L. Gong, Y . Wang, X. Cheng, R. Zhang, and H. Zhang, “A novel effective medium theory for modelling the thermal conductivity of porous materials,”International Journal of Heat and Mass Transfer, vol. 68, pp. 295–298, 2014

work page 2014
[33]

Aro: Autoregressive operator learning for transferable and multi-fidelity 3d- ic thermal analysis with active learning,

M. Wang, Y . Cheng, W. Zeng, Z. Lu, V . F. Pavlidis, and W. Xing, “Aro: Autoregressive operator learning for transferable and multi-fidelity 3d- ic thermal analysis with active learning,” inProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024, pp. 1–9

work page 2024
[34]

Hotspot: A compact thermal modeling methodology for early-stage vlsi design,

W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, “Hotspot: A compact thermal modeling methodology for early-stage vlsi design,”IEEE Transactions on very large scale integration (VLSI) systems, vol. 14, no. 5, pp. 501–513, 2006

work page 2006
[35]

The alpha 21264 microprocessor,

R. E. Kessler, “The alpha 21264 microprocessor,”IEEE micro, vol. 19, no. 2, pp. 24–36, 1999

work page 1999
[36]

U-fno—an enhanced fourier neural operator-based deep-learning model for multiphase flow,

G. Wen, Z. Li, K. Azizzadenesheli, A. Anandkumar, and S. M. Benson, “U-fno—an enhanced fourier neural operator-based deep-learning model for multiphase flow,”Advances in Water Resources, vol. 163, p. 104180, 2022

work page 2022
[37]

T-fusion: Thermal modeling of 3d ics with multi-fidelity fusion,

B. Zhang, W. Xing, X. Zhao, and Y . Sun, “T-fusion: Thermal modeling of 3d ics with multi-fidelity fusion,” inProceedings of the 30th Asia and South Pacific Design Automation Conference, 2025, pp. 1406–1412

work page 2025
[38]

Pisov: Physics-informed separation of variables solvers for full-chip thermal analysis,

L. Chen, W. Zhu, M. Tang, S. X.-D. Tan, J.-F. Mao, and J. Zhang, “Pisov: Physics-informed separation of variables solvers for full-chip thermal analysis,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024

work page 2024

[1] [1]

From fluid dynamics to chip design: Pde foundation models address the data bottleneck in 3d-ic thermal simulation,

Z. Huang, H. Xin, D. Ma, Y . Wei, W. Yang, Y . Zhang, T.-J. Lin, W. W. Xing, and L. He, “From fluid dynamics to chip design: Pde foundation models address the data bottleneck in 3d-ic thermal simulation,” in Proceedings of the 63rd ACM/IEEE Design Automation Conference (DAC). IEEE, 2026, pp. 1–7, to be published

work page 2026

[2] [2]

Computationally efficient standard-cell fem-based thermal analysis,

Y .-C. Chen, S. Ladenheim, H. Kalargaris, M. Mihajlovi ´c, and V . F. Pavlidis, “Computationally efficient standard-cell fem-based thermal analysis,” in2017 IEEE/ACM International Conference on Computer- Aided Design (ICCAD). IEEE, 2017, pp. 490–495

work page 2017

[3] [3]

Efficient full-chip thermal modeling and analysis,

P. Li, L. T. Pileggi, M. Asheghi, and R. Chandra, “Efficient full-chip thermal modeling and analysis,” inIEEE/ACM International Conference on Computer Aided Design, 2004. ICCAD-2004.IEEE, 2004, pp. 319– 326

work page 2004

[4] [4]

Thermal-aware floorplanning and TSV-planning for mixed-type modules in a fixed-outline 3-d ic,

J.-M. Lin, W.-Y . Chang, H.-Y . Hsieh, Y .-T. Shyu, Y .-J. Chang, and J.- M. Lu, “Thermal-aware floorplanning and TSV-planning for mixed-type modules in a fixed-outline 3-d ic,”IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, no. 9, pp. 1652–1664, 2021. 14

work page 2021

[5] [5]

Deepoheat: operator learning-based ultra-fast thermal simulation in 3d- ic design,

Z. Liu, Y . Li, J. Hu, X. Yu, S. Shiau, X. Ai, Z. Zeng, and Z. Zhang, “Deepoheat: operator learning-based ultra-fast thermal simulation in 3d- ic design,” in2023 60th ACM/IEEE Design Automation Conference (DAC). IEEE, 2023, pp. 1–6

work page 2023

[6] [6]

Self-attention to operator learning-based 3d- ic thermal simulation,

Z. Huang, H. Wang, W. Yang, M. Tang, D. Xie, T.-J. Lin, Y . Zhang, W. W. Xing, and L. He, “Self-attention to operator learning-based 3d- ic thermal simulation,” in2025 62nd ACM/IEEE Design Automation Conference (DAC). IEEE, 2025, pp. 1–7

work page 2025

[7] [7]

A survey of chip-level thermal simulators,

H. Sultan, A. Chauhan, and S. R. Sarangi, “A survey of chip-level thermal simulators,”ACM Computing Surveys (CSUR), vol. 52, no. 2, pp. 1–35, 2019

work page 2019

[8] [8]

A stepwise integration separation of variables solver for full- chip thermal uncertainty analysis,

L. Yin, A. Wang, W. Zhu, A. Guo, J. Liu, M. Tang, L. Chen, and J. Zhang, “A stepwise integration separation of variables solver for full- chip thermal uncertainty analysis,”IEEE Transactions on Components, Packaging and Manufacturing Technology, 2024

work page 2024

[9] [9]

Dpot: Auto-regressive denoising operator transformer for large- scale pde pre-training.arXiv preprint arXiv:2403.03542,

Z. Hao, C. Su, S. Liu, J. Berner, C. Ying, H. Su, A. Anandkumar, J. Song, and J. Zhu, “Dpot: Auto-regressive denoising operator transformer for large-scale pde pre-training,”arXiv preprint arXiv:2403.03542, 2024

work page arXiv 2024

[10] [10]

Poseidon: Efficient foundation models for pdes,

M. Herde, B. Raonic, T. Rohner, R. K ¨appeli, R. Molinaro, E. de B´ezenac, and S. Mishra, “Poseidon: Efficient foundation models for pdes,”Ad- vances in Neural Information Processing Systems, vol. 37, pp. 72 525– 72 624, 2024

work page 2024

[11] [11]

2d-thermal: Physics- informed framework for thermal analysis of circuits using generative ai,

S. Chandra, S. S. Chowdhury, and K. Roy, “2d-thermal: Physics- informed framework for thermal analysis of circuits using generative ai,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025

[12] [12]

Pindas: Physics-informed decoupled spa- tiotemporal artificial neural network for dynamic thermal simulation,

D. Coenen and H. Oprins, “Pindas: Physics-informed decoupled spa- tiotemporal artificial neural network for dynamic thermal simulation,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025

[13] [13]

Estimation of steady-state temperature field in multichip modules using deep convolutional neural network,

Y . Hua, Z.-Q. Wang, X.-Y . Yuan, Y . B. Li, W.-T. Wu, and N. Aubry, “Estimation of steady-state temperature field in multichip modules using deep convolutional neural network,”Thermal Science and Engineering Progress, vol. 40, p. 101755, 2023

work page 2023

[14] [14]

Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network,

Y . Zhang, Z. Gong, W. Zhou, X. Zhao, X. Zheng, and W. Yao, “Multi-fidelity surrogate modeling for temperature field prediction using deep convolution neural network,”Engineering Applications of Artificial Intelligence, vol. 123, p. 106354, 2023

work page 2023

[15] [15]

Transfer learning of convolutional neural network model for thermal estimation of multichip modules,

Z.-Q. Wang, Y . Hua, H.-R. Xie, Z.-F. Zhou, Y .-B. Li, and W.-T. Wu, “Transfer learning of convolutional neural network model for thermal estimation of multichip modules,”Case Studies in Thermal Engineering, vol. 59, p. 104576, 2024

work page 2024

[16] [16]

Fast full-chip parametric thermal analysis based on enhanced physics enforced neural networks,

L. Chen, J. Lu, W. Jin, and S. X.-D. Tan, “Fast full-chip parametric thermal analysis based on enhanced physics enforced neural networks,” in2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–8

work page 2023

[17] [17]

Physics-informed learning for fast transient tsv electromigration analysis,

X. Yang, W. Zhu, Y . Zhang, Y . Xue, W. Sheng, P. Ren, R. Wang, Z. Ji, and H.-B. Chen, “Physics-informed learning for fast transient tsv electromigration analysis,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2026

work page 2026

[18] [18]

Pi-onet: A physics-informed operator network for efficient thermal analysis of multilayer chiplets,

Y . Sha, C. Zhang, and Q. Chen, “Pi-onet: A physics-informed operator network for efficient thermal analysis of multilayer chiplets,”IEEE Transactions on Components, Packaging and Manufacturing Technol- ogy, 2025

work page 2025

[19] [19]

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,

M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational physics, vol. 378, pp. 686–707, 2019

work page 2019

[20] [20]

Physics- informed neural networks for heat transfer problems,

S. Cai, Z. Wang, S. Wang, P. Perdikaris, and G. E. Karniadakis, “Physics- informed neural networks for heat transfer problems,”Journal of Heat Transfer, vol. 143, no. 6, p. 060801, 2021

work page 2021

[21] [21]

Asrr-pinn: Adaptive sub-regional random resampling-based pinn for thermal analysis of 3d-ics,

Z. Zhou, M. Tang, and L. Chen, “Asrr-pinn: Adaptive sub-regional random resampling-based pinn for thermal analysis of 3d-ics,” in2025 62nd ACM/IEEE Design Automation Conference (DAC). IEEE, 2025, pp. 1–7

work page 2025

[22] [22]

Fourier Neural Operator for Parametric Partial Differential Equations

Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Fourier neural operator for parametric partial differential equations,”arXiv preprint arXiv:2010.08895, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2010

[23] [23]

DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators

L. Lu, P. Jin, and G. E. Karniadakis, “Deeponet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators,”arXiv preprint arXiv:1910.03193, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1910

[24] [24]

Full-chip thermal map estimation for commercial multi-core cpus with generative adver- sarial learning,

W. Jin, S. Sadiqbatcha, J. Zhang, and S. X.-D. Tan, “Full-chip thermal map estimation for commercial multi-core cpus with generative adver- sarial learning,” inProceedings of the 39th International Conference on Computer-Aided Design, 2020, pp. 1–9

work page 2020

[25] [25]

Real-time thermal map estimation for amd multi-core cpus using transformer,

J. Lu, J. Zhang, and S. X.-D. Tan, “Real-time thermal map estimation for amd multi-core cpus using transformer,” in2023 IEEE/ACM Inter- national Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–7

work page 2023

[26] [26]

Real-time thermal map estimation for amd multi-core cpus using transformer,

J. Lu, J. Zhang, and S. X. Tan, “Real-time thermal map estimation for amd multi-core cpus using transformer,” in2023 IEEE/ACM Interna- tional Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–7

work page 2023

[27] [27]

Fast machine learning based prediction for temperature simulation using compact models,

M. Hajikhodaverdian, S. Reda, and A. K. Coskun, “Fast machine learning based prediction for temperature simulation using compact models,” in2025 Design, Automation & Test in Europe Conference (DATE). IEEE, 2025, pp. 1–2

work page 2025

[28] [28]

Fast steady-state thermal analysis with separation of variables and discrete cosine transform,

H. Ai, L. Chen, J. Zhang, B. Yu, and W. Zhu, “Fast steady-state thermal analysis with separation of variables and discrete cosine transform,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2025

work page 2025

[29] [29]

Fasttherm: Fast and stable full-chip transient thermal predictor considering nonlinear effects,

T. Zhu, Q. Wang, Y . Lin, R. Wang, and R. Huang, “Fasttherm: Fast and stable full-chip transient thermal predictor considering nonlinear effects,” inProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024, pp. 1–9

work page 2024

[30] [30]

T. C. Choy,Effective Medium Theory: Principles and Applications. Oxford University Press, 12 2015. [Online]. Available: https: //doi.org/10.1093/acprof:oso/9780198705093.001.0001

work page doi:10.1093/acprof:oso/9780198705093.001.0001 2015

[31] [31]

Equivalent inclusion method for steady state heat conduction in composites,

H. Hiroshi and T. Minoru, “Equivalent inclusion method for steady state heat conduction in composites,”International Journal of Engineering Science, vol. 24, no. 7, pp. 1159–1172, 1986

work page 1986

[32] [32]

A novel effective medium theory for modelling the thermal conductivity of porous materials,

L. Gong, Y . Wang, X. Cheng, R. Zhang, and H. Zhang, “A novel effective medium theory for modelling the thermal conductivity of porous materials,”International Journal of Heat and Mass Transfer, vol. 68, pp. 295–298, 2014

work page 2014

[33] [33]

Aro: Autoregressive operator learning for transferable and multi-fidelity 3d- ic thermal analysis with active learning,

M. Wang, Y . Cheng, W. Zeng, Z. Lu, V . F. Pavlidis, and W. Xing, “Aro: Autoregressive operator learning for transferable and multi-fidelity 3d- ic thermal analysis with active learning,” inProceedings of the 43rd IEEE/ACM International Conference on Computer-Aided Design, 2024, pp. 1–9

work page 2024

[34] [34]

Hotspot: A compact thermal modeling methodology for early-stage vlsi design,

W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, “Hotspot: A compact thermal modeling methodology for early-stage vlsi design,”IEEE Transactions on very large scale integration (VLSI) systems, vol. 14, no. 5, pp. 501–513, 2006

work page 2006

[35] [35]

The alpha 21264 microprocessor,

R. E. Kessler, “The alpha 21264 microprocessor,”IEEE micro, vol. 19, no. 2, pp. 24–36, 1999

work page 1999

[36] [36]

U-fno—an enhanced fourier neural operator-based deep-learning model for multiphase flow,

G. Wen, Z. Li, K. Azizzadenesheli, A. Anandkumar, and S. M. Benson, “U-fno—an enhanced fourier neural operator-based deep-learning model for multiphase flow,”Advances in Water Resources, vol. 163, p. 104180, 2022

work page 2022

[37] [37]

T-fusion: Thermal modeling of 3d ics with multi-fidelity fusion,

B. Zhang, W. Xing, X. Zhao, and Y . Sun, “T-fusion: Thermal modeling of 3d ics with multi-fidelity fusion,” inProceedings of the 30th Asia and South Pacific Design Automation Conference, 2025, pp. 1406–1412

work page 2025

[38] [38]

Pisov: Physics-informed separation of variables solvers for full-chip thermal analysis,

L. Chen, W. Zhu, M. Tang, S. X.-D. Tan, J.-F. Mao, and J. Zhang, “Pisov: Physics-informed separation of variables solvers for full-chip thermal analysis,”IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024

work page 2024