
arxiv: 2605.23037 · v1 · pith:KHMXKBEAnew · submitted 2026-05-21 · 💻 cs.LG · physics.flu-dyn

Open Multimodal Datasets and Open-Source Software for Data-Driven Modeling of Multiphase Transport and Thermal Systems

Pith reviewed 2026-05-25 05:32 UTC · model grok-4.3

classification 💻 cs.LG physics.flu-dyn
keywords multimodal datasets · open-source software · multiphase transport · thermal systems · S+TD framework · data-driven modeling · AI-enabled research · thermal-fluid digital twins

The pith

Open multimodal datasets and software packages support reproducible AI modeling of multiphase transport and thermal systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Data-driven approaches to multiphase transport, electronics cooling, and thermal-fluid digital twins are slowed by datasets that remain fragmented and difficult to decode or benchmark across studies. The paper releases an open collection of NED3 datasets covering boiling images, high-speed videos, infrared thermography, acoustic measurements, and CFD fields, all organized under a new S+TD classification that tags each by its spatial and temporal dimensionality. Complementary open-source packages such as BubbleID, SeqReg, and CFDTwin are described for tasks including computer vision on bubble dynamics and sequence regression for heat-flux estimation. The work positions these resources as a foundation for community-wide reuse and the eventual construction of interoperable thermal-fluid databanks that link raw data to physically interpretable models. A sympathetic reader would see the releases as a concrete step toward reducing duplication of effort in AI-enabled thermal research.

Core claim

The paper presents an open ecosystem consisting of multimodal NED3 datasets classified by the S+TD framework (0+0D point values through 3+0D volumetric fields and multimodal combinations) together with open-source software packages including BubbleID, SeqReg, CFDTwin, IRISApp, decode-wfs, AELab, and FlowLab; these resources are offered to enable reproducible, AI-enabled modeling of multiphase transport, thermal systems, and related diagnostics.

What carries the argument

The S+TD (spatial-plus-temporal dimensionality) framework that classifies datasets according to the dimensionality of measured or simulated fields, such as 2+1D videos or 3+0D volumetric fields.
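
To make the tagging idea concrete, here is a minimal sketch of how an array might be assigned an S+TD label. The `STDLabel` class, the `std_label` helper, and the convention that time is the leading axis are all assumptions made for illustration; none of them come from the paper or from the NED3 packages.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class STDLabel:
    """Hypothetical S+TD tag, e.g. 2+1D for a video."""
    spatial: int   # number of spatial axes (0 to 3)
    temporal: int  # 1 if the data has a time axis, else 0

    def __str__(self) -> str:
        return f"{self.spatial}+{self.temporal}D"


def std_label(array: np.ndarray, has_time_axis: bool) -> STDLabel:
    """Tag an array by its dimensionality.

    Assumption of this sketch: when a time axis exists, it is the leading
    axis and every remaining axis is spatial.
    """
    temporal = 1 if has_time_axis else 0
    spatial = array.ndim - temporal
    if not 0 <= spatial <= 3:
        raise ValueError(f"unsupported spatial dimensionality: {spatial}")
    return STDLabel(spatial, temporal)


# Synthetic placeholders standing in for real measurements.
point = np.asarray(101.3)            # single point value
series = np.zeros(5000)              # one sensor channel over time
image = np.zeros((256, 256))         # single frame
video = np.zeros((120, 256, 256))    # frames x height x width
volume = np.zeros((32, 64, 64))      # volumetric field

labels = {
    "point": str(std_label(point, has_time_axis=False)),
    "series": str(std_label(series, has_time_axis=True)),
    "image": str(std_label(image, has_time_axis=False)),
    "video": str(std_label(video, has_time_axis=True)),
    "volume": str(std_label(volume, has_time_axis=False)),
}
print(labels)
```

The caller has to state whether a time axis exists, because array shape alone cannot distinguish a 2+1D video from a 3+0D volume.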

If this is right

  • Standardized multimodal datasets become available for direct comparison of AI models on boiling, acoustic, and thermographic tasks.
  • SeqReg enables nonintrusive estimation of heat flux from 0+1D, 1+1D, or 2+1D time-series or video data.
  • CFD-generated fields and design files can be paired with experimental measurements inside surrogate models for digital twins.
  • Acoustic-emission and waveform-decoding tools support diagnostics that combine audio and thermal signals.
  • Future databanks can be built by linking the released metadata, decoders, and baselines into shared repositories.
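
The sequence-regression consequence above depends on turning a long recording into fixed-length input windows paired with a target value. The sketch below shows one common way to do that with NumPy for both 0+1D and 2+1D inputs. It does not use or reproduce the SeqReg API; `make_windows`, the window and stride values, and the synthetic data are hypothetical.

```python
import numpy as np


def make_windows(signal, target, window, stride=1):
    """Slice a recording into overlapping windows along the time axis.

    signal: array of shape (T, ...) with time as the leading axis.
    target: array of shape (T,), e.g. heat flux at each time step.
    Each window is paired with the target at the window's last time step.
    """
    signal = np.asarray(signal)
    target = np.asarray(target)
    if signal.shape[0] != target.shape[0]:
        raise ValueError("signal and target must share the time-axis length")
    if not 1 <= window <= signal.shape[0]:
        raise ValueError("window must be between 1 and the number of time steps")
    starts = np.arange(0, signal.shape[0] - window + 1, stride)
    X = np.stack([signal[s:s + window] for s in starts])
    y = target[starts + window - 1]
    return X, y


# Synthetic stand-ins for real measurements (assumed shapes, arbitrary units).
rng = np.random.default_rng(0)
T = 100
series = rng.normal(size=T)            # 0+1D: one sensor channel
video = rng.normal(size=(T, 8, 8))     # 2+1D: small frames over time
heat_flux = np.linspace(0.0, 1.0, T)   # target value at each time step

X_series, y_series = make_windows(series, heat_flux, window=20, stride=5)
X_video, y_video = make_windows(video, heat_flux, window=20, stride=5)
print(X_series.shape, X_video.shape, y_series.shape)
```

Pairing each window with the target at its final step is one choice among several; a window-averaged target would suit slowly varying heat loads equally well.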

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The S+TD labels could be adopted by other laboratories working with time-resolved imaging or sensor arrays, creating a de-facto standard for dimensionality tagging.
  • Integration of the software packages with existing physics-informed neural-network frameworks would allow direct testing of whether the open data improve model generalization.
  • Industrial users might test the packages on proprietary thermal systems to check whether the NED3 baselines transfer beyond laboratory conditions.
  • The emphasis on waveform decoding and acoustic analysis points toward possible cross-application in non-destructive testing or structural-health monitoring outside thermal fluids.

Load-bearing premise

The released datasets, the S+TD classification, and the listed software packages will prove sufficient to overcome fragmentation and enable community-wide benchmarking and reuse.

What would settle it

Subsequent published studies in multiphase transport or thermal-fluid AI that continue to rely on private or non-interoperable datasets rather than the NED3 releases, or that report no use of the S+TD labels or the listed software packages.

Figures

Figures reproduced from arXiv: 2605.23037 by Annapurna Parjuli, Braden Stevens, Chinmaya Joshi, Christy Dunlap, Daniel Curl, Han Hu, Hari Pandey, Mohammad Ishraq Hossain, Stephen Pierson.

Figure 1: Two-phase transport processes and associated research topics.
Figure 2: Multiscale nature of thermal transport processes spanning molecular, interfacial, compo…
Figure 3: Representative multidimensional static and transient "S+TD" thermal fluids datasets.
Figure 4: Categories of open-source data in the NED3 ecosystem showing that open thermal-fluid…
Figure 5: SeqReg workflow from loading data to preparing sequences, defining or loading models,…
Figure 6: Multidimensional sequential data used by SeqReg showing a data-format strategy for 1D,…
Figure 7: Data curation and physics metadata for a two-phase cooling dataset. (a) Workflow…

Original abstract

Data-driven modeling is becoming central to multiphase transport, electronics cooling, acoustic diagnostics, and thermal-fluid digital twins, but progress is limited by fragmented datasets and raw instrument files that are difficult to decode, reuse, or benchmark. This paper presents an open ecosystem of multimodal datasets and open-source software packages developed by the Nano Energy and Data-Driven Discovery (NED3) Laboratory for reproducible AI-enabled thermal-fluid research. We introduce a spatial-plus-temporal dimensionality framework, denoted S+TD, to classify datasets by the dimensionality of measured or simulated fields, including 0+0D point values, 0+1D time series, 1+0D profiles, 2+0D images, 2+1D videos, 3+0D volumetric fields, and multimodal combinations. We organize public NED3 datasets spanning boiling images, acoustic and thermal measurements, high-speed videos, infrared thermography, thermal-resistance measurements, CFD-generated fields, design files, and acoustic-emission data. We also describe complementary software packages, including BubbleID, SeqReg, CFDTwin, IRISApp, decode-wfs, AELab, and FlowLab, which support computer vision, sequence regression, surrogate modeling, infrared analysis, waveform decoding, acoustic-emission analysis, and multimodal diagnostics. Particular emphasis is placed on SeqReg, a general sequence-regression library for 0+1D, 1+1D, and 2+1D data, with applications such as nonintrusive heat-flux estimation. Finally, we discuss future community efforts to build interoperable thermal-fluid databanks and curated AI/ML tool libraries that connect datasets, metadata, decoders, baselines, benchmarks, and physically interpretable models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript announces the release of an open ecosystem of multimodal datasets and open-source software packages developed by the NED3 Laboratory to support reproducible, AI-enabled research in multiphase transport, thermal-fluid systems, and related areas. It introduces the S+TD classification framework (0+0D through 3+0D and multimodal) to organize datasets spanning boiling images, acoustic/thermal measurements, high-speed videos, infrared thermography, CFD fields, and acoustic-emission data. Complementary tools including BubbleID, SeqReg (with emphasis on sequence regression for heat-flux estimation), CFDTwin, IRISApp, decode-wfs, AELab, and FlowLab are described, along with a forward-looking discussion of community databanks and benchmarks.

Significance. If the resources are publicly accessible, documented, and maintained as stated, the work supplies concrete, reusable assets that directly address dataset fragmentation in thermal-fluid modeling. The S+TD taxonomy and SeqReg library offer immediate utility for computer-vision and time-series tasks; the overall release lowers barriers to benchmarking and could accelerate physically interpretable ML models in the domain.

minor comments (2)
  1. Abstract: the statement that the resources 'will be sufficient to overcome fragmentation' is forward-looking; a brief qualifier noting that community adoption remains to be demonstrated would align the abstract more closely with the descriptive nature of the contribution.
  2. Section describing SeqReg: the applications (e.g., nonintrusive heat-flux estimation) would benefit from one or two concrete performance metrics or baseline comparisons to illustrate the library's practical value beyond its general design.
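
The baseline comparison requested in minor comment 2 could be reported along the following lines. This is a sketch under stated assumptions: the numbers are invented purely to exercise the code, `regression_report` is a hypothetical helper, and the mean-value predictor is only one possible baseline. Nothing here reflects results from the paper.

```python
import numpy as np


def regression_report(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Made-up values in arbitrary units, for illustration only.
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_model = np.array([12.0, 18.0, 33.0, 39.0])

# Trivial baseline: always predict the mean. Here the mean is taken from
# y_true for brevity; in practice it would come from the training split.
y_baseline = np.full_like(y_true, y_true.mean())

model_report = regression_report(y_true, y_model)
baseline_report = regression_report(y_true, y_baseline)
print("model   ", model_report)
print("baseline", baseline_report)
```

Reporting the model next to a trivial baseline shows how much of the error reduction is attributable to the model rather than to the spread of the targets.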

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and for recommending acceptance. The review accurately captures the contributions of the S+TD framework, the multimodal datasets, and the open-source tools, particularly SeqReg.

Circularity Check

0 steps flagged

No circularity; purely descriptive resource announcement with no derivations or predictions

full rationale

The paper contains no derivation chain, equations, fitted parameters, or predictions. Its central claim is the release and organization of NED3 datasets, the S+TD classification framework (introduced as a descriptive taxonomy), and listed software packages. No step reduces to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The forward-looking aspiration about overcoming fragmentation is not a load-bearing premise. This matches the default expectation of no circularity for non-derivational papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This paper is a description of released datasets and software with no new physical models, derivations, or fitted parameters.

pith-pipeline@v0.9.0 · 5884 in / 1070 out tokens · 32604 ms · 2026-05-25T05:32:23.791902+00:00 · methodology
