NUCLEUS-MoE: Unified Model of Pool Boiling for Liquid Cooling

Aparna Chandramowlishwaran; Arthur Feeney; Sheikh Md Shakeel Hassan; Siddhartha Rachabathuni; Xianwei Zou

arxiv: 2605.27722 · v1 · pith:HZJ5DP3Knew · submitted 2026-05-26 · 💻 cs.LG


Pith reviewed 2026-06-29 18:27 UTC · model grok-4.3


keywords: pool boiling · mixture of experts · surrogate modeling · two-phase flow · liquid cooling · generalization · scientific machine learning


The pith

A single mixture-of-experts model replaces separate surrogates for pool boiling across dielectrics, refrigerants, and cryogens.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces NUCLEUS, a mixture-of-experts architecture that jointly predicts saturated and subcooled pool boiling for three fluid classes using data from high-fidelity simulations. It combines neighborhood attention with signed distance field reinitialization to enforce interface consistency and lets expert routing develop specialization without explicit labels. The model matches or exceeds prior specialized surrogates on accuracy and physical consistency while showing zero-shot and few-shot transfer to an unseen fluid. If correct, this removes the need to train and maintain separate models for each fluid or condition in liquid-cooling applications.

Core claim

NUCLEUS is a mixture-of-experts surrogate that unifies saturated and subcooled boiling prediction across dielectrics, refrigerants, and cryogens. It employs neighborhood attention and signed distance field reinitialization for interface consistency, with expert routing that develops coherent spatial specialization without supervision. Trained on high-fidelity simulations, the model matches or exceeds baseline performance on heterogeneous configurations, preserves physical consistency, and demonstrates zero-shot and few-shot generalization to a new fluid such as Opteon 2P50.
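The locality idea behind neighborhood attention can be illustrated with a small sketch in which each token attends only to tokens inside a fixed window around it. This is a simplified 1D, single-head version in plain Python and is not the paper's implementation: the function name `neighborhood_attention_1d`, the `radius` parameter, and the choice to truncate windows at the domain borders are all assumptions made for illustration.

```python
import math


def neighborhood_attention_1d(queries, keys, values, radius):
    """Each position attends only to positions within `radius` of itself.

    Hypothetical helper: inputs are lists of equal-length float vectors.
    Windows are simply truncated at the borders (an assumption).
    """
    n = len(queries)
    dim = len(queries[0])
    outputs = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        # Scaled dot-product scores against the local window only.
        scores = [
            sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(dim)
            for j in range(lo, hi)
        ]
        # Numerically stable softmax over the window.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out = [0.0] * len(values[0])
        for w, j in zip(weights, range(lo, hi)):
            for d, v in enumerate(values[j]):
                out[d] += (w / total) * v
        outputs.append(out)
    return outputs
```

With `radius=0` every token sees only itself and the values pass through unchanged; increasing `radius` widens the receptive field while keeping the cost per token proportional to the window size rather than to the full sequence length.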

What carries the argument

Mixture-of-experts routing combined with neighborhood attention and signed distance field reinitialization, where routing produces emergent specialization across boiling regimes.
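A minimal sketch of top-k expert routing may help make the mechanism concrete. It assumes a softmax router whose weights are renormalized over the selected experts, which is one common convention and not necessarily the one used in NUCLEUS. The names `route_top_k` and `moe_forward`, the default `k=2`, and the representation of experts as plain Python callables are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the selected experts
    (an assumption; other conventions exist).
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda e: probs[e], reverse=True)[:k]
    norm = sum(probs[e] for e in top)
    return [(e, probs[e] / norm) for e in top]


def moe_forward(patch, router_logits, experts, k=2):
    """Combine the outputs of the selected experts, weighted by the router."""
    out = [0.0] * len(patch)
    for e, w in route_top_k(router_logits, k):
        y = experts[e](patch)  # each expert maps a patch to a same-length output
        out = [o + w * yi for o, yi in zip(out, y)]
    return out
```

Only the selected experts are evaluated for a given patch, which is what lets different experts specialize on different regions of the flow without every expert processing every patch.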

If this is right

One trained model covers saturated and subcooled regimes for dielectrics, refrigerants, and cryogens instead of requiring separate models.
Expert routing develops spatial structure and regime specialization without explicit supervision.
The architecture maintains physical consistency while matching or exceeding prior surrogates on test configurations.
Zero-shot and few-shot transfer to a new fluid developed for immersion cooling is possible.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same routing mechanism could be tested on other multiphase transport problems where dynamics change sharply with fluid properties.
If the emergent specialization aligns with known boiling regimes, it may reduce the need for hand-crafted regime detection in future surrogates.
Generalization results suggest the model could serve as a starting point for online adaptation when new fluids are introduced in cooling systems.

Load-bearing premise

The high-fidelity simulations used for training accurately capture the real physics of boiling for the fluids and conditions considered.

What would settle it

A direct comparison on an unseen fluid or condition in which NUCLEUS produces larger errors, or larger conservation-law violations, than a set of fluid-specific baselines.

Figures

Figures reproduced from arXiv: 2605.27722 by Aparna Chandramowlishwaran, Arthur Feeney, Sheikh Md Shakeel Hassan, Siddhartha Rachabathuni, Xianwei Zou.

**Figure 1.** NUCLEUS unifies saturated and subcooled boiling across multiple fluids within a single architecture. (a) Physical fields reveal different dynamics: saturated boiling (top) shows concentrated evaporation and rising large bubbles, while subcooled boiling (bottom) exhibits bulk condensation resulting in turbulent vortices and smaller bubbles. (b) Mixture-of-experts (MoE) routing patterns show emergent special…

**Figure 2.** Empirical validation of ViT-style global attention …

**Figure 3.** NUCLEUS Architecture. Spatiotemporal patches of the state 𝑆 = (𝑇, 𝑈, 𝜙) are input to a transformer backbone with temporal attention followed by spatial neighborhood attention, enforcing locality aligned with physical interactions. Fluid-specific parameters are input via FiLM conditioning. MoE routes patches to top-k MLP experts, enabling learned specialization of phase-change behaviors. The model predicts…

**Figure 4.** Spatiotemporal expert specialization in subcooled R515B boiling. Temperature fields at four timesteps overlaid with …

**Figure 5.** Compares the signed distance function …

**Figure 6.** Mean absolute error (MAE) heatmaps for (top) saturated boiling and (bottom) subcooled boiling. Rows correspond to …

**Figure 7.** Distribution of temperature and vertical velocity for subcooled boiling over 100 timesteps autoregressive rollout. …

**Figure 8.** Few-shot adaption of NUCLEUS to subcooled OP2P50 boiling using only three simulations. Left: Temperature and velocity distributions for an unseen rollout with a heater temperature of 97°C. Right: Example autoregressive rollout after finetuning, demonstrating stable interface, thermal transport, and velocity evolution despite limited finetuning data.

**Figure 9.** Comparison of training NUCLEUS from scratch on OP2P50 versus finetuning a pretrained model. Shown is the single-step relative L2 error for 25 random samples from the OP2P50 test simulations …

**Figure 10.** Comparison of MoE-DPOT, Poseidon, and NUCLEUS. Both MoE-DPOT and Poseidon exhibit diffused thermal structures and loss of interface sharpness even at single inference step. During autoregressive rollouts, errors compound and performance deteriorates rapidly. In contrast, NUCLEUS better preserves coherent thermal and interface structures …
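The captions above refer to single-step relative L2 error and to errors compounding during autoregressive rollouts. The sketch below shows one plausible way to compute both on flattened fields. The function names `relative_l2` and `rollout`, the list-of-floats state representation, and the generic `step_fn` callable are assumptions for illustration and are not taken from the paper's code.

```python
import math


def relative_l2(pred, target):
    """Relative L2 error ||pred - target|| / ||target|| on flattened fields.

    Assumes `target` is not identically zero.
    """
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    den = math.sqrt(sum(t ** 2 for t in target))
    return num / den


def rollout(step_fn, initial_state, num_steps):
    """Autoregressive rollout: feed each prediction back in as the next input."""
    states = [initial_state]
    for _ in range(num_steps):
        states.append(step_fn(states[-1]))
    return states[1:]  # predictions only, without the initial state
```

As a toy usage example, rolling out a reference map `s -> 0.9 * s` next to a slightly wrong surrogate `s -> 0.91 * s` and applying `relative_l2` at each step gives an error that grows with every step, which is the compounding behavior the Figure 10 caption describes for long rollouts.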

Original abstract

Two-phase boiling enables heat transfer rates an order of magnitude higher than single-phase cooling, but it remains difficult to model due to the strong coupling between phase change, turbulence, and transport, as well as extreme sensitivity to fluid properties and thermodynamic conditions. Existing learning-based surrogates are either condition- or fluid-specific, limiting generalization and requiring separate models. We present NUCLEUS, a mixture-of-experts model for pool boiling that replaces collections of specialized surrogates with a single architecture. NUCLEUS combines neighborhood attention, signed distance field reinitialization for interface consistency, and expert routing that exhibits emergent specialization across distinct boiling dynamics. Trained on high-fidelity simulations of pool boiling, NUCLEUS jointly models saturated and subcooled boiling across three fluid classes (dielectrics, refrigerants, and cryogens), resolving failure modes of prior models on extreme fluids. We show that expert routing exhibits coherent spatial structure and specialization without explicit supervision. Quantitatively, NUCLEUS matches or exceeds baselines while maintaining physical consistency across heterogeneous boiling configurations. We also show zero-shot and few-shot generalization capabilities on downstream tasks such as a new fluid (Opteon 2P50 developed for immersion cooling). These results demonstrate that mixture-of-experts models are a scalable pathway toward unified surrogate modeling of boiling dynamics and lay the groundwork for broader generalization across scientific ML.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

NUCLEUS-MoE is a single mixture-of-experts architecture for pool boiling across fluid classes, trained and evaluated on simulations. Its expert routing specializes without supervision, but the reported gains rest on simulation accuracy that has not been verified for cryogens.


The main point is that this paper builds one mixture-of-experts model to cover saturated and subcooled boiling for dielectrics, refrigerants, and cryogens instead of separate surrogates per fluid or condition. It adds neighborhood attention and signed distance field reinitialization, then lets the experts develop specialization on their own during training on high-fidelity simulations.

The architecture choices and the reported zero-shot and few-shot results on a new fluid (Opteon 2P50) are the concrete advances. The paper shows the routing develops coherent spatial patterns without explicit labels, and the model matches or beats baselines on the simulation test cases while keeping some physical consistency. That is useful incremental work in scientific ML for thermal systems.

The soft spot is exactly the one the stress-test note flags. Everything is trained and evaluated on simulations, with no experimental heat-transfer data or measured coefficients referenced for the cryogen cases or high-heat-flux regimes. If those simulations miss key turbulence or phase-change effects in extreme fluids, the performance numbers and the generalization claims become harder to trust. The zero-shot test is also on simulated data for the new fluid, not hardware.

This is aimed at researchers building surrogates for multiphase heat transfer and electronics cooling. Readers who already trust the underlying simulation codes will get the most out of the architecture details and routing analysis.

It deserves peer review. The unified modeling direction is worth referee scrutiny even with the validation gap, and reviewers can ask for the missing experimental comparisons or sensitivity checks.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces NUCLEUS-MoE, a single mixture-of-experts architecture combining neighborhood attention and signed-distance-field reinitialization to jointly model saturated and subcooled pool boiling across dielectrics, refrigerants, and cryogens. Trained exclusively on high-fidelity simulations, the model claims to match or exceed prior baselines, exhibit emergent expert specialization without supervision, preserve physical consistency, and demonstrate zero-shot/few-shot generalization to an unseen fluid (Opteon 2P50).

Significance. If the simulation-to-reality gap is closed and the reported generalization holds, the work would demonstrate that MoE routing can capture heterogeneous multiphase physics at scale without hand-crafted per-fluid models, offering a concrete path toward unified surrogates in thermal-fluid engineering.

major comments (2)

[Methods (training data generation) and Results (quantitative evaluation)] The central performance and generalization claims rest on the premise that the high-fidelity pool-boiling simulations accurately capture coupled phase-change/turbulence physics for cryogens. No section provides direct quantitative comparison of simulated heat-transfer coefficients or bubble statistics against experimental measurements for any cryogen in the high-heat-flux or low-temperature regime; without such validation the reported gains over baselines and the emergent specialization could be artifacts of the synthetic data distribution.
[Results (expert routing)] § on expert routing analysis: the claim that routing exhibits 'coherent spatial structure and specialization' corresponding to distinct boiling dynamics is presented qualitatively. No quantitative metric (e.g., mutual information between router logits and local heat-flux regime labels, or ablation showing performance drop when routing is randomized) is supplied to demonstrate that the specialization is functionally meaningful rather than incidental.
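The mutual-information check suggested in the second major comment could be sketched as follows. This is a simplified version that uses the discrete top-1 expert index per patch instead of the raw router logits, and it assumes that a regime label per patch is available; the function name `mutual_information` and the label strings in the usage note are hypothetical.

```python
import math
from collections import Counter


def mutual_information(expert_ids, regime_labels):
    """Mutual information in bits between two equal-length label sequences."""
    n = len(expert_ids)
    joint = Counter(zip(expert_ids, regime_labels))
    expert_counts = Counter(expert_ids)
    regime_counts = Counter(regime_labels)
    mi = 0.0
    for (expert, regime), count in joint.items():
        p_joint = count / n
        # p(e, r) / (p(e) * p(r)) written in terms of raw counts.
        ratio = count * n / (expert_counts[expert] * regime_counts[regime])
        mi += p_joint * math.log2(ratio)
    return mi
```

For example, `mutual_information([0, 0, 1, 1], ["nucleate", "nucleate", "bulk", "bulk"])` gives 1 bit because the expert index fully determines the label, while an assignment that is independent of the labels gives 0. A value near zero on real routing data would suggest the observed spatial patterns are incidental.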

minor comments (2)

[Model architecture] Notation for the signed-distance-field reinitialization step should be made explicit (e.g., the precise reinitialization equation and frequency) so that reproducibility is possible from the text alone.
[Abstract and Results] The abstract states that NUCLEUS 'resolves failure modes of prior models on extreme fluids,' but the manuscript does not tabulate the specific failure modes (e.g., divergence, unphysical negative temperatures) that each baseline exhibited on the cryogen test cases.
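To illustrate what a signed-distance-field reinitialization step does, the sketch below rebuilds a 1D signed distance field from the zero crossings of a distorted level-set function. It is a brute-force geometric redistancing chosen for brevity, not the scheme used in the paper, which the report notes is not specified explicitly. The function name `reinitialize_sdf_1d` and the uniform grid spacing `dx` are assumptions.

```python
def reinitialize_sdf_1d(phi, dx=1.0):
    """Rebuild a 1D signed distance field from the zero crossings of `phi`.

    Hypothetical helper: keeps the sign of each node and replaces its
    magnitude with the distance to the nearest interface location.
    """
    crossings = []
    for i in range(len(phi) - 1):
        a, b = phi[i], phi[i + 1]
        if a == 0.0:
            crossings.append(i * dx)
        elif a * b < 0.0:
            # Linear interpolation of the zero location between nodes i and i+1.
            crossings.append((i + a / (a - b)) * dx)
    if phi[-1] == 0.0:
        crossings.append((len(phi) - 1) * dx)
    if not crossings:
        return list(phi)  # no interface to measure distance from
    result = []
    for i, value in enumerate(phi):
        distance = min(abs(i * dx - c) for c in crossings)
        sign = -1.0 if value < 0.0 else 1.0
        result.append(sign * distance)
    return result
```

The interface location is preserved while the field recovers unit slope, which is the property a reinitialization step is meant to restore after a learned model has distorted the field away from a true distance function.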

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below and indicate where revisions will be made to the manuscript.


Referee: [Methods (training data generation) and Results (quantitative evaluation)] The central performance and generalization claims rest on the premise that the high-fidelity pool-boiling simulations accurately capture coupled phase-change/turbulence physics for cryogens. No section provides direct quantitative comparison of simulated heat-transfer coefficients or bubble statistics against experimental measurements for any cryogen in the high-heat-flux or low-temperature regime; without such validation the reported gains over baselines and the emergent specialization could be artifacts of the synthetic data distribution.

Authors: We agree that the manuscript does not contain direct quantitative comparisons between the high-fidelity simulations and experimental measurements specifically for cryogens in the high-heat-flux or low-temperature regimes. The underlying solver is drawn from established numerical methods whose validation against experiments is documented in the cited prior literature for a range of fluids and conditions; however, we acknowledge that this does not constitute new, direct validation within the present work for the cryogen cases at the extremes of the parameter space. We will revise the Methods section to include an expanded discussion of the simulation validation status, explicitly note the simulation-to-experiment gap as a limitation, and clarify that all performance and generalization claims are made within the simulation domain. These changes will be reflected in the revised manuscript. (Revision: yes.)
Referee: [Results (expert routing)] § on expert routing analysis: the claim that routing exhibits 'coherent spatial structure and specialization' corresponding to distinct boiling dynamics is presented qualitatively. No quantitative metric (e.g., mutual information between router logits and local heat-flux regime labels, or ablation showing performance drop when routing is randomized) is supplied to demonstrate that the specialization is functionally meaningful rather than incidental.

Authors: The expert routing analysis in the current manuscript relies on qualitative visualization of routing patterns and their spatial correspondence to boiling regimes. We will strengthen this section by adding quantitative metrics, including mutual information between router logits and local heat-flux regime labels derived from the simulation data, as well as an ablation experiment in which routing is replaced by random assignment to quantify the resulting performance drop. These additions will be included in the revised Results section on expert routing. (Revision: yes.)

Circularity Check

0 steps flagged

No circularity; empirical ML training and generalization claims are self-contained


The paper describes an MoE architecture trained on high-fidelity pool-boiling simulations, with performance claims resting on quantitative matches to baselines, physical consistency checks, and zero/few-shot tests on a held-out fluid (Opteon 2P50). No equations, uniqueness theorems, or fitted-parameter renamings are presented that would reduce any reported prediction to the training inputs by construction. No self-citations are invoked as load-bearing premises for the architecture or results. The derivation chain is therefore the standard supervised-learning pipeline (train on sims, evaluate on generalization tasks) and remains externally falsifiable.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no specific free parameters, axioms, or invented entities are detailed beyond standard neural network training. The model relies on learned expert routing but introduces no new physical entities.

pith-pipeline@v0.9.1-grok · 5800 in / 1340 out tokens · 46755 ms · 2026-06-29T18:27:37.141173+00:00 · methodology



[1] [1]

Mohammad Azarifar, Mehmet Arik, and Je-Young Chang. 2024. Liquid cooling of data centers: A necessity facing challenges.Applied Thermal Engineering247 (2024), 123112

2024

[2] [2]

Kamyar Azizzadenesheli, Nikola Kovachki, Zongyi Li, Miguel Liu-Schiaffini, Jean Kossaifi, and Anima Anandkumar. 2024. Neural operators for accelerating scientific simulations and design.Nature Reviews Physics6, 5 (2024), 320–328

2024

[3] [3]

Steven L Brunton, Bernd R Noack, and Petros Koumoutsakos. 2020. Machine learning for fluid mechanics.Annual review of fluid mechanics52, 1 (2020), 477–508

2020

[4] [4]

Matteo Bucci, Andrew Richenderfer, Guan-Yu Su, Thomas McKrell, and Jacopo Buongiorno. 2016. A mechanistic IR calibration technique for boiling heat transfer investigations.International Journal of Multiphase Flow83 (2016), 115–127

2016

[5] [5]

Krishnapriyan

Nithin Chalapathi, Yiheng Du, and Aditi S. Krishnapriyan. 2024. Scaling physics- informed hard constraints with mixture-of-experts. InThe Twelfth International Conference on Learning Representations. https://openreview.net/forum?id= u3dX2CEIZb

2024

[6] [6]

Damai Dai, Chengqi Deng, Chenggang Zhao, RX Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Yu Wu, et al. 2024. Deepseekmoe: Towards ultimate expert specialization in mixture-of-experts language models.arXiv preprint arXiv:2401.06066(2024)

work page internal anchor Pith review Pith/arXiv arXiv 2024

[7] [7]

Vijay K Dhir, Gopinath R Warrier, and Eduardo Aktinol. 2013. Numerical simula- tion of pool boiling: a review.Journal of Heat Transfer135, 6 (2013), 061502

2013

[8] [8]

Akash Dhruv, Elias Balaras, Amir Riaz, and Jungho Kim. 2019. A formulation for high-fidelity simulations of pool boiling in low gravity.International Journal of Multiphase Flow120 (2019), 103099

2019

[9] [9]

Jaco Dirker, Diksha Juggurnath, Alihan Kaya, Emmanuel A Osowade, Michael Simpson, Steven Lecompte, Seyyed Mohammad Ali Noori Rahim Abadi, Victor Voulgaropoulos, Adekunle O Adelaja, M Zaid Dauhoo, et al . 2019. Thermal energy processes in direct steam generation solar systems: Boiling, condensation and energy storage–A review.Frontiers in Energy Research6 ...

2019

[10] [10]

Alexey Dosovitskiy. 2020. An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929(2020)

work page internal anchor Pith review Pith/arXiv arXiv 2020

[11]

Anshu Dubey, Klaus Weide, Jared O’Neal, Akash Dhruv, Sean Couch, J. Austin Harris, Tom Klosterman, Rajeev Jain, Johann Rudi, Bronson Messer, Michael Pajkos, Jared Carlson, Ran Chu, Mohamed Wahib, Saurabh Chawdhary, Paul M. Ricker, Dongwook Lee, Katie Antypas, Katherine M. Riley, Christopher Daley, Murali Ganapathy, Francis X. Timmes, Dean M. Townsley, Mar... doi:10.1016/j.softx.2022.101168

[12]

Veronika Eyring, William D Collins, Pierre Gentine, Elizabeth A Barnes, Marcelo Barreiro, Tom Beucler, Marc Bocquet, Christopher S Bretherton, Hannah M Christensen, Katherine Dagon, et al. 2024. Pushing the frontiers in climate modelling and analysis with machine learning. Nature Climate Change 14, 9 (2024), 916–928

[13]

Amir Faghri and Yuwen Zhang. 2019. Fundamentals of Multiphase Heat Transfer and Flow. Springer Nature

[14]

William Fedus, Barret Zoph, and Noam Shazeer. 2022. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. arXiv:2101.03961 [cs.LG] https://arxiv.org/abs/2101.03961

[15]

Chi-Yeh Han. 1962. The mechanism of heat transfer in nucleate pool boiling. Ph.D. Dissertation. Massachusetts Institute of Technology

[16]

Sheikh Md Shakeel Hassan, Arthur Feeney, Akash Dhruv, Jihoon Kim, Youngjoon Suh, Jaiyoung Ryu, Yoonjin Won, and Aparna Chandramowlishwaran. 2023. BubbleML: A multiphase multiphysics dataset and benchmarks for machine learning. Advances in Neural Information Processing Systems 36 (2023), 418–449

[17]

Sheikh Md Shakeel Hassan, Xianwei Zou, Akash Dhruv, and Aparna Chandramowlishwaran. 2026. Bubbleformer: Forecasting Boiling with Transformers. Advances in Neural Information Processing Systems 38 (2026)

[18]

Ali Hassani, Steven Walton, Jiachen Li, Shen Li, and Humphrey Shi. 2023. Neighborhood attention transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 6185–6194

[19]

Maximilian Herde, Bogdan Raonic, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel de Bézenac, and Siddhartha Mishra. 2024. Poseidon: Efficient foundation models for PDEs. Advances in Neural Information Processing Systems 37 (2024), 72525–72624

[20]

Siavash Khodakarami, Vivek Oommen, Aniruddha Bora, and George Em Karniadakis. 2025. Mitigating spectral bias in neural operators via high-frequency scaling for physical systems. arXiv preprint arXiv:2503.13695 (2025)

[21]

Jean Kossaifi, Nikola Kovachki, Morteza Mardani, Daniel Leibovici, Suman Ravuri, Ira Shokar, Edoardo Calvello, Mohammad Shoaib Abbas, Peter Harrington, Ashay Subramaniam, et al. 2026. Demystifying Data-Driven Probabilistic Medium-Range Weather Forecasting. arXiv preprint arXiv:2601.18111 (2026)

[22]

Jakub Krajewski, Jan Ludziejewski, Kamil Adamczewski, Maciej Pióro, Michał Krutul, Szymon Antoniak, Kamil Ciebiera, Krystian Król, Tomasz Odrzygóźdź, Piotr Sankowski, et al. 2024. Scaling laws for fine-grained mixture of experts. arXiv preprint arXiv:2402.07871 (2024)

[23]

Ashley Lawson, Martin Offutt, Natalie Ortiz, and Ling Zhu. 2026. Data Centers and Their Energy Consumption: Frequently Asked Questions. Technical Report R48646. U.S. Congress. https://www.congress.gov/crs-product/R48646

[24]

Eric W Lemmon, Ian H Bell, ML Huber, and MO McLinden. 2018. NIST standard reference database 23: reference fluid thermodynamic and transport properties-REFPROP, Version 10.0, National Institute of Standards and Technology. Standard Reference Data Program, Gaithersburg (2018), 45–46

[25]

Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. 2021. Fourier Neural Operator for Parametric Partial Differential Equations. arXiv:2010.08895 [cs.LG] https://arxiv.org/abs/2010.08895

[26]

Phillip Lippe, Bastiaan S. Veeling, Paris Perdikaris, Richard E. Turner, and Johannes Brandstetter. 2023. PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers. arXiv:2308.05732 [cs.LG] https://arxiv.org/abs/2308.05732

[27]

Yang Liu, Chengqi Wang, Yalan Qian, and Xiaodong Sun. 2020. Uncertainty analysis of PIV measurements in bubbly flows considering sampling and bubble effects with ray optics modeling. Nuclear Engineering and Design 364 (2020), 110677

[28]

Shiro Nukiyama. 1966. The maximum and minimum values of the heat Q transmitted from metal to boiling water under atmospheric pressure. International Journal of Heat and Mass Transfer 9, 12 (1966), 1419–1433

[29]

Johan Obando-Ceron, Ghada Sokar, Timon Willi, Clare Lyle, Jesse Farebrother, Jakob Foerster, Gintare Karolina Dziugaite, Doina Precup, and Pablo Samuel Castro. 2024. Mixtures of experts unlock parameter scaling for deep RL. arXiv preprint arXiv:2402.08609 (2024)

[30]

Stanley Osher and Ronald Fedkiw. 2003. Level Set Methods and Dynamic Implicit Surfaces. Springer-Verlag New York Inc

[31]

Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. 2018. FiLM: Visual reasoning with a general conditioning layer. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32

[32]

Yoeri Poels, Koen Minartz, Harshit Bansal, and Vlado Menkovski. 2024. Accelerating Simulation of Two-Phase Flows with Neural PDE Surrogates. In ICML 2024 AI for Science Workshop. https://openreview.net/forum?id=yIqszw9RUc

[33]

Stephen B. Pope. 2000. Turbulent Flows. Cambridge University Press

[34]

Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby. 2021. Scaling vision with sparse mixture of experts. Advances in Neural Information Processing Systems 34 (2021), 8583–8595

[35]

Warren M Rohsenow. 1952. A method of correlating heat-transfer data for surface boiling of liquids. Transactions of the American Society of Mechanical Engineers 74, 6 (1952), 969–975

[36]

Yohei Sato and Bojan Niceno. 2018. Pool boiling simulation using an interface tracking method: From nucleate boiling to film boiling regime through critical heat flux. International Journal of Heat and Mass Transfer 125 (2018), 876–890

[37]

Mehdi Shadkhah, Ronak Tali, Ali Rabeh, Cheng-Hau Yang, Ethan Herron, Abhisek Upadhyaya, Adarsh Krishnamurthy, Chinmay Hegde, Aditya Balu, and Baskar Ganapathysubramanian. 2025. MPFBench: A Large Scale Dataset for SciML of Multi-Phase-Flows: Droplet and Bubble Dynamics. arXiv preprint arXiv:2502.07080 (2025)

[38]

Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, and Jeff Dean. 2017. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. In International Conference on Learning Representations. https://openreview.net/forum?id=B1ckMDqlg

[39]

Mark Sussman and Emad Fatemi. 1999. An Efficient, Interface-Preserving Level Set Redistancing Algorithm and Its Application to Interfacial Incompressible Fluid Flow. SIAM Journal on Scientific Computing 20, 4 (1999), 1165–1191. doi:10.1137/S1064827596298245

[40]

Alasdair Tran, Alexander Mathews, Lexing Xie, and Cheng Soon Ong. 2023. Factorized Fourier Neural Operators. In The Eleventh International Conference on Learning Representations. https://openreview.net/forum?id=tmIiMPl4IPa

[41]

Hong Wang, Haiyang Xin, Jie Wang, Xuanze Yang, Fei Zha, Huanshuo Dong, and Yan Jiang. 2025. Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training. In The Thirty-ninth Annual Conference on Neural Information Processing Systems. https://openreview.net/forum?id=PNgG4H3q9D

[42]

Peihao Wang, Wenqing Zheng, Tianlong Chen, and Zhangyang Wang. 2022. Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice. In International Conference on Learning Representations. https://openreview.net/forum?id=O476oWmiNNp

[43]

Zhenzhong Wang, Xin Zhang, Jun Liao, and Min Jiang. 2025. Cross-Field Interface-Aware Neural Operators for Multiphase Flow Simulation. arXiv preprint arXiv:2511.08625 (2025)

[44]

Haixu Wu, Huakun Luo, Haowen Wang, Jianmin Wang, and Mingsheng Long. 2024. Transolver: A Fast Transformer Solver for PDEs on General Geometries. In Forty-first International Conference on Machine Learning

[46]

Novak Zuber. 1959. Hydrodynamic aspects of boiling heat transfer. Number 4439. United States Atomic Energy Commission, Technical Information Service

A BubbleML Dataset

To test the generalization of NUCLEUS to physical scenarios not seen during pretraining, we generate out-of-distribution datasets...