NUCLEUS-MoE: Unified Model of Pool Boiling for Liquid Cooling
Pith reviewed 2026-06-29 18:27 UTC · model grok-4.3
The pith
A single mixture-of-experts model replaces separate surrogates for pool boiling across dielectrics, refrigerants, and cryogens.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NUCLEUS is a mixture-of-experts surrogate that unifies saturated and subcooled boiling prediction across dielectrics, refrigerants, and cryogens. It employs neighborhood attention and signed distance field reinitialization for interface consistency, with expert routing that develops coherent spatial specialization without supervision. Trained on high-fidelity simulations, the model matches or exceeds baseline performance on heterogeneous configurations, preserves physical consistency, and demonstrates zero-shot and few-shot generalization to a new fluid such as Opteon 2P50.
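To make the neighborhood attention ingredient concrete, the following is a minimal 1D NumPy sketch, not the authors' implementation. It assumes a dense score matrix with a band mask and simply truncates the window at the domain edges. The function name `neighborhood_attention_1d` and the `radius` parameter are hypothetical; the paper's actual spatial layout, window size, and boundary handling are not given in this summary.

```python
import numpy as np

def neighborhood_attention_1d(q, k, v, radius=1):
    """Illustrative local attention: query i sees only keys j with |i - j| <= radius.

    q, k, v have shape (n, d). Returns the attended values and the weight matrix.
    """
    n, d = q.shape
    scores = (q @ k.T) / np.sqrt(d)                      # (n, n) scaled dot products
    idx = np.arange(n)
    outside = np.abs(idx[:, None] - idx[None, :]) > radius
    scores = np.where(outside, -np.inf, scores)          # mask keys outside the window
    scores -= scores.max(axis=1, keepdims=True)          # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Example call on random features (values are arbitrary).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
out, weights = neighborhood_attention_1d(x, x, x, radius=1)
```

A production kernel would avoid building the full (n, n) matrix; the dense mask here only serves to show which entries a local window keeps.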
What carries the argument
Mixture-of-experts routing combined with neighborhood attention and signed distance field reinitialization, where routing produces emergent specialization across boiling regimes.
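As a rough illustration of what expert routing means here, the sketch below implements generic top-k gating over linear experts in NumPy. It is a sketch under stated assumptions, not the NUCLEUS router: the names `moe_layer`, `router_w`, and `expert_ws`, the choice of `top_k=2`, and the use of plain linear experts are all hypothetical, and the summary does not specify the paper's gating function or load-balancing terms.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def moe_layer(tokens, router_w, expert_ws, top_k=2):
    """Generic top-k mixture-of-experts layer (illustrative only).

    tokens: (n, d), router_w: (d, E), expert_ws: (E, d, d).
    Returns the mixed output, the chosen expert indices, and their gate weights.
    """
    logits = tokens @ router_w                              # (n, E) router scores
    top_idx = np.argsort(-logits, axis=-1)[:, :top_k]       # (n, k) best experts per token
    top_logits = np.take_along_axis(logits, top_idx, axis=-1)
    gates = softmax(top_logits, axis=-1)                    # renormalize over chosen experts
    out = np.zeros_like(tokens)
    for slot in range(top_k):
        for e in range(expert_ws.shape[0]):
            mask = top_idx[:, slot] == e                    # tokens sent to expert e in this slot
            if mask.any():
                gate = gates[mask, slot][:, None]
                out[mask] += gate * (tokens[mask] @ expert_ws[e])
    return out, top_idx, gates

# Example call with random weights (values are arbitrary).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
router_w = rng.normal(size=(4, 3))
expert_ws = rng.normal(size=(3, 4, 4))
out, top_idx, gates = moe_layer(tokens, router_w, expert_ws)
```

The returned `top_idx` is the kind of per-token assignment one would visualize over the spatial grid to inspect whether experts specialize by region or regime.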
If this is right
- One trained model covers saturated and subcooled regimes for dielectrics, refrigerants, and cryogens instead of requiring separate models.
- Expert routing develops spatial structure and regime specialization without explicit supervision.
- The architecture maintains physical consistency while matching or exceeding prior surrogates on test configurations.
- Zero-shot and few-shot transfer to a new fluid developed for immersion cooling (Opteon 2P50) is possible.
Where Pith is reading between the lines
- The same routing mechanism could be tested on other multiphase transport problems where dynamics change sharply with fluid properties.
- If the emergent specialization aligns with known boiling regimes, it may reduce the need for hand-crafted regime detection in future surrogates.
- Generalization results suggest the model could serve as a starting point for online adaptation when new fluids are introduced in cooling systems.
Load-bearing premise
The high-fidelity simulations used for training accurately capture the real physics of boiling for the fluids and conditions considered.
What would settle it
A direct comparison on an unseen fluid or condition in which NUCLEUS produces larger errors, or violates conservation laws more severely, than a set of fluid-specific baselines.
Original abstract
Two-phase boiling enables heat transfer rates an order of magnitude higher than single-phase cooling, but it remains difficult to model due to the strong coupling between phase change, turbulence, and transport, as well as extreme sensitivity to fluid properties and thermodynamic conditions. Existing learning-based surrogates are either condition- or fluid-specific, limiting generalization and requiring separate models. We present NUCLEUS, a mixture-of-experts model for pool boiling that replaces collections of specialized surrogates with a single architecture. NUCLEUS combines neighborhood attention, signed distance field reinitialization for interface consistency, and expert routing that exhibits emergent specialization across distinct boiling dynamics. Trained on high-fidelity simulations of pool boiling, NUCLEUS jointly models saturated and subcooled boiling across three fluid classes (dielectrics, refrigerants, and cryogens), resolving failure modes of prior models on extreme fluids. We show that expert routing exhibits coherent spatial structure and specialization without explicit supervision. Quantitatively, NUCLEUS matches or exceeds baselines while maintaining physical consistency across heterogeneous boiling configurations. We also show zero-shot and few-shot generalization capabilities on downstream tasks such as a new fluid (Opteon 2P50 developed for immersion cooling). These results demonstrate that mixture-of-experts models are a scalable pathway toward unified surrogate modeling of boiling dynamics and lay the groundwork for broader generalization across scientific ML.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces NUCLEUS-MoE, a single mixture-of-experts architecture combining neighborhood attention and signed-distance-field reinitialization to jointly model saturated and subcooled pool boiling across dielectrics, refrigerants, and cryogens. Trained exclusively on high-fidelity simulations, the model is reported to match or exceed prior baselines, exhibit emergent expert specialization without supervision, preserve physical consistency, and generalize zero-shot and few-shot to an unseen fluid (Opteon 2P50).
Significance. If the simulation-to-reality gap is closed and the reported generalization holds, the work would demonstrate that MoE routing can capture heterogeneous multiphase physics at scale without hand-crafted per-fluid models, offering a concrete path toward unified surrogates in thermal-fluid engineering.
major comments (2)
- [Methods (training data generation) and Results (quantitative evaluation)] The central performance and generalization claims rest on the premise that the high-fidelity pool-boiling simulations accurately capture coupled phase-change/turbulence physics for cryogens. No section provides direct quantitative comparison of simulated heat-transfer coefficients or bubble statistics against experimental measurements for any cryogen in the high-heat-flux or low-temperature regime; without such validation the reported gains over baselines and the emergent specialization could be artifacts of the synthetic data distribution.
- [Results (expert routing)] The claim that routing exhibits 'coherent spatial structure and specialization' corresponding to distinct boiling dynamics is presented only qualitatively. No quantitative metric (e.g., mutual information between router logits and local heat-flux regime labels, or an ablation showing the performance drop when routing is randomized) is supplied to demonstrate that the specialization is functionally meaningful rather than incidental.
minor comments (2)
- [Model architecture] Notation for the signed-distance-field reinitialization step should be made explicit (e.g., the precise reinitialization equation and frequency) so that reproducibility is possible from the text alone.
- [Abstract and Results] The abstract states that NUCLEUS 'resolves failure modes of prior models on extreme fluids,' but the manuscript does not tabulate the specific failure modes (e.g., divergence, unphysical negative temperatures) that each baseline exhibited on the cryogen test cases.
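For context on the first minor comment, the textbook level-set reinitialization equation, evolved in pseudo-time until the field recovers the signed-distance property, is shown below. This is the standard form from the level-set literature and is offered only as a reference point; whether NUCLEUS uses this form, a discrete variant, or a different procedure, and how often it is applied, is not stated in this review. The symbols $\phi_0$ (field before reinitialization), $\tau$ (pseudo-time), and $\epsilon$ (smoothing width, typically on the order of the grid spacing) are the conventional ones, not notation taken from the manuscript.

```latex
\frac{\partial \phi}{\partial \tau}
  + S(\phi_0)\left(\lvert \nabla \phi \rvert - 1\right) = 0,
\qquad
\phi(\mathbf{x}, 0) = \phi_0(\mathbf{x}),
\qquad
S(\phi_0) = \frac{\phi_0}{\sqrt{\phi_0^{2} + \epsilon^{2}}}
```

At steady state $\lvert \nabla \phi \rvert = 1$ while the zero level set of $\phi_0$, which marks the liquid-vapor interface, is left in place.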
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive comments. We address each major point below and indicate where revisions will be made to the manuscript.
- Referee: [Methods (training data generation) and Results (quantitative evaluation)] The central performance and generalization claims rest on the premise that the high-fidelity pool-boiling simulations accurately capture coupled phase-change/turbulence physics for cryogens. No section provides direct quantitative comparison of simulated heat-transfer coefficients or bubble statistics against experimental measurements for any cryogen in the high-heat-flux or low-temperature regime; without such validation the reported gains over baselines and the emergent specialization could be artifacts of the synthetic data distribution.
Authors: We agree that the manuscript does not contain direct quantitative comparisons between the high-fidelity simulations and experimental measurements specifically for cryogens in the high-heat-flux or low-temperature regimes. The underlying solver is drawn from established numerical methods whose validation against experiments is documented in the cited prior literature for a range of fluids and conditions; however, we acknowledge that this does not constitute new, direct validation within the present work for the cryogen cases at the extremes of the parameter space. We will revise the Methods section to include an expanded discussion of the simulation validation status, explicitly note the simulation-to-experiment gap as a limitation, and clarify that all performance and generalization claims are made within the simulation domain. These changes will be reflected in the revised manuscript.
Revision: yes
- Referee: [Results (expert routing)] The claim that routing exhibits 'coherent spatial structure and specialization' corresponding to distinct boiling dynamics is presented only qualitatively. No quantitative metric (e.g., mutual information between router logits and local heat-flux regime labels, or an ablation showing the performance drop when routing is randomized) is supplied to demonstrate that the specialization is functionally meaningful rather than incidental.
Authors: The expert routing analysis in the current manuscript relies on qualitative visualization of routing patterns and their spatial correspondence to boiling regimes. We will strengthen this section by adding quantitative metrics, including mutual information between router logits and local heat-flux regime labels derived from the simulation data, as well as an ablation experiment in which routing is replaced by random assignment to quantify the resulting performance drop. These additions will be included in the revised Results section on expert routing.
Revision: yes
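The mutual-information metric discussed in this exchange can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the analysis the authors will run: it assumes hard top-1 expert indices (an argmax over router logits) rather than the raw logits, integer-coded regime labels per grid cell, and a plug-in estimate from the empirical joint histogram. The function name `mutual_information` and the toy arrays are hypothetical.

```python
import numpy as np

def mutual_information(experts, regimes):
    """Plug-in mutual information (in nats) between two integer label arrays."""
    experts = np.asarray(experts).ravel()
    regimes = np.asarray(regimes).ravel()
    n_e, n_r = experts.max() + 1, regimes.max() + 1
    joint = np.zeros((n_e, n_r))
    np.add.at(joint, (experts, regimes), 1.0)      # co-occurrence counts
    joint /= joint.sum()                           # empirical joint distribution
    p_e = joint.sum(axis=1, keepdims=True)         # marginal over experts
    p_r = joint.sum(axis=0, keepdims=True)         # marginal over regimes
    nz = joint > 0                                 # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (p_e @ p_r)[nz])))

# Toy labels: routing that tracks the regime exactly vs. routing unrelated to it.
aligned = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
unrelated = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In practice the score would be compared against the same quantity computed with the regime labels randomly permuted, which gives a chance-level baseline for the finite-sample bias of the plug-in estimator.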
Circularity Check
No circularity; empirical ML training and generalization claims are self-contained
The paper describes an MoE architecture trained on high-fidelity pool-boiling simulations, with performance claims resting on quantitative matches to baselines, physical consistency checks, and zero/few-shot tests on a held-out fluid (Opteon 2P50). No equations, uniqueness theorems, or fitted-parameter renamings are presented that would reduce any reported prediction to the training inputs by construction. No self-citations are invoked as load-bearing premises for the architecture or results. The derivation chain is therefore the standard supervised-learning pipeline (train on sims, evaluate on generalization tasks) and remains externally falsifiable.
A BubbleML Dataset
To test the generalization of NUCLEUS to physical scenarios not seen during pretraining, we generate out-of-distribution datasets...