A Comparative Study of Deep Learning Models for Geological Carbon Sequestration
Pith reviewed 2026-06-27 20:29 UTC · model grok-4.3
The pith
Deep learning surrogate accuracy for carbon sequestration varies strongly with whether the target field follows a hyperbolic or elliptic PDE.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a controlled benchmark of U-Net, V-Net, temporal convolutional networks, Fourier neural operators, and U-FNO on the 2D CO2 injection problem, surrogate performance is strongly dependent on the underlying PDE type, with U-FNO achieving the highest accuracy for predicting CO2 saturation fields and FNO providing the best performance for pressure build-up prediction.
What carries the argument
Head-to-head comparison of U-Net, V-Net, TCN, FNO and U-FNO trained to map heterogeneous permeability-porosity fields and injection parameters to transient saturation and pressure fields.
If this is right
- For saturation prediction tasks the U-FNO architecture should be selected over the other tested models.
- For pressure build-up prediction the plain FNO should be selected over the other tested models.
- Architecture selection for coupled flow problems must account for the hyperbolic versus elliptic character of each field.
- Memory and training-time differences among the models become decision criteria once accuracy rankings are known.
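The hyperbolic versus elliptic distinction can be made concrete with a textbook simplification. The sketch below assumes incompressible, immiscible two-phase flow with capillary pressure and gravity neglected (the fractional-flow form). The paper's actual governing equations are not reproduced in this summary and may include compressibility and other effects; every symbol here is introduced for illustration only.

```latex
% Total Darcy velocity: \lambda_t(S) is the total mobility, K the permeability tensor
\mathbf{v} = -\lambda_t(S)\, K \nabla p

% Pressure equation (from \nabla \cdot \mathbf{v} = q): elliptic in p for a fixed saturation S
-\nabla \cdot \big( \lambda_t(S)\, K \nabla p \big) = q

% Saturation equation: hyperbolic in S for a fixed velocity \mathbf{v};
% \phi is porosity, f(S) the fractional-flow function, q_S the phase source term
\phi\, \frac{\partial S}{\partial t} + \nabla \cdot \big( f(S)\, \mathbf{v} \big) = q_S
```

In this limit the pressure field tends to be smooth and globally coupled to the source, while the saturation field can carry sharp fronts. That is one plausible reading of why the two fields might favor different architectures, not a result established by the benchmark.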
Where Pith is reading between the lines
- If the 2D ranking holds in 3D, digital-twin workflows for carbon storage could route saturation and pressure queries to different specialized surrogates rather than a single model.
- The observed PDE-type dependence suggests that future operator-learning work should test whether separate Fourier or convolutional branches for hyperbolic and elliptic components improve accuracy on fully coupled multi-phase problems.
- History-matching and optimization loops that alternate between saturation and pressure updates could achieve lower overall error by switching architectures mid-loop.
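The routing idea in the first bullet can be sketched as a small dispatcher. This is a minimal illustration, assuming each trained surrogate is exposed as a callable that maps permeability and porosity arrays to a predicted field. The names `make_router`, `ufno_saturation`, and `fno_pressure` are hypothetical placeholders, not part of the paper or of any library.

```python
from typing import Callable, Dict

import numpy as np

# Hypothetical surrogate signature: (permeability, porosity) -> predicted field.
Surrogate = Callable[[np.ndarray, np.ndarray], np.ndarray]


def make_router(surrogates: Dict[str, Surrogate]):
    """Return predict(field, permeability, porosity) that dispatches by field name."""

    def predict(field: str, permeability: np.ndarray, porosity: np.ndarray) -> np.ndarray:
        if field not in surrogates:
            raise KeyError(f"no surrogate registered for field {field!r}")
        return surrogates[field](permeability, porosity)

    return predict


# Placeholder callables standing in for trained models (assumptions, not real networks).
def ufno_saturation(permeability, porosity):
    return np.zeros_like(permeability)


def fno_pressure(permeability, porosity):
    return np.ones_like(permeability)


predict = make_router({"saturation": ufno_saturation, "pressure": fno_pressure})

k = np.full((4, 4), 1e-13)  # illustrative permeability field
phi = np.full((4, 4), 0.2)  # illustrative porosity field
saturation = predict("saturation", k, phi)
pressure = predict("pressure", k, phi)
```

Keeping the mapping in a plain dictionary means the per-field assignment can be changed if a 3D or multi-well study produces a different ranking.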
Load-bearing premise
The 2D single-wellbore injection problem with anisotropic heterogeneous permeability and porosity is representative of the high-dimensional transient subsurface flow problems encountered in real geological carbon sequestration.
What would settle it
Repeating the identical architecture comparison on a 3D heterogeneous reservoir model or on a multi-well injection scenario and checking whether U-FNO and FNO retain their respective top rankings for saturation and pressure.
Original abstract
Numerical reservoir simulations are extremely computationally expensive, as they require the repeated solution of large nonlinear algebraic systems derived from the discretized governing equations. With growing demand for real-time optimization, uncertainty quantification, and history matching in digital twin applications, reducing computational cost has become essential. Deep learning (DL)-based surrogate models have emerged as an effective approach for accelerating subsurface flow simulations. Here, we seek to determine which DL architectures are best suited for high-dimensional, transient subsurface flow problems. In this study, we examine the advantages and relative costs associated with training such models, including memory requirements, training speed, accuracy, robustness, and generalization. We conduct a comparative study of several DL architectures commonly used as surrogate models for subsurface flow problems, including U-Net, V-Net, Temporal Convolutional Networks, Fourier Neural Operators (FNO), and a U-Net-enhanced FNO (U-FNO). As a benchmark, we compare the performance of the studied models for geological carbon sequestration to predict transient pressure build-up and CO$_2$ saturation fields. We study the problem of CO$_2$ injection into a single wellbore in a two-dimensional domain, which is parameterized by anisotropic, heterogeneous permeability and porosity fields, injection configurations, and reservoir properties. Results demonstrate that surrogate model performance is strongly dependent on the underlying PDE type (i.e., hyperbolic vs. elliptic). The U-FNO achieves the highest accuracy for predicting CO$_2$ saturation fields, while the FNO provides the best performance for pressure build-up prediction.
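For readers unfamiliar with the Fourier layer that FNO and U-FNO build on, the following NumPy sketch shows the core spectral-convolution step: transform to Fourier space, keep only a few low-frequency modes, mix channels with complex weights, and transform back. It is a simplified illustration with random, untrained weights, not the authors' implementation. The function name `spectral_conv2d` and its argument layout are assumptions made here.

```python
import numpy as np


def spectral_conv2d(x, weights, modes1, modes2):
    """Apply a truncated spectral convolution.

    x:       real array of shape (c_in, H, W)
    weights: complex array of shape (2, c_in, c_out, modes1, modes2);
             index 0 acts on low positive row frequencies, index 1 on low negative ones
    Requires 2 * modes1 <= H and modes2 <= W // 2 + 1.
    """
    c_in, H, W = x.shape
    c_out = weights.shape[2]
    x_ft = np.fft.rfft2(x, axes=(-2, -1))  # shape (c_in, H, W // 2 + 1)
    out_ft = np.zeros((c_out, H, W // 2 + 1), dtype=complex)
    # Mix input channels into output channels, mode by mode, on the retained frequencies.
    out_ft[:, :modes1, :modes2] = np.einsum(
        "ixy,ioxy->oxy", x_ft[:, :modes1, :modes2], weights[0]
    )
    out_ft[:, -modes1:, :modes2] = np.einsum(
        "ixy,ioxy->oxy", x_ft[:, -modes1:, :modes2], weights[1]
    )
    # All other modes stay zero, which is the low-pass truncation.
    return np.fft.irfft2(out_ft, s=(H, W), axes=(-2, -1))


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 16, 16))  # two input channels on a 16 x 16 grid
w = rng.standard_normal((2, 2, 3, 4, 4)) + 1j * rng.standard_normal((2, 2, 3, 4, 4))
y = spectral_conv2d(x, w, modes1=4, modes2=4)  # three output channels, same grid
```

A full FNO block would add a pointwise linear path and a nonlinearity around this step; U-FNO, as its name suggests, additionally couples it with a U-Net path.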
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript performs a comparative evaluation of deep learning surrogate models (U-Net, V-Net, Temporal Convolutional Networks, FNO, and U-FNO) for accelerating numerical simulations of geological carbon sequestration. Using a 2D single-wellbore CO₂ injection benchmark with heterogeneous anisotropic permeability/porosity fields, the study reports relative performance in accuracy, training speed, memory use, robustness, and generalization for predicting transient pressure build-up (elliptic) and CO₂ saturation (hyperbolic) fields, concluding that model ranking depends strongly on PDE type with U-FNO best for saturation and FNO best for pressure.
Significance. If the empirical rankings prove robust under proper statistical validation and controlled isolation of factors, the work would offer practical guidance for selecting operator-learning architectures in subsurface flow surrogates, directly supporting real-time optimization and uncertainty quantification in GCS digital twins. The explicit inclusion of computational costs (memory, training time) alongside accuracy is a positive feature that strengthens applicability.
major comments (2)
- [Abstract] Abstract: The central claim that 'surrogate model performance is strongly dependent on the underlying PDE type (i.e., hyperbolic vs. elliptic)' is not adequately supported by the described experiments. Both fields are generated by the same coupled multiphase system; no ablation holds field statistics fixed while varying only the operator class (e.g., pure Darcy vs. pure advection problems), so the observed U-FNO vs. FNO ranking could equally reflect output regularity, discontinuity handling, or loss weighting rather than the hyperbolic/elliptic distinction.
- [Abstract] Abstract / benchmark description: The 2D single-wellbore injection problem is presented as representative of 'high-dimensional transient subsurface flow problems,' yet no scaling studies or comparisons to 3D heterogeneous cases are referenced to substantiate this; this assumption directly underpins the generalization claims for real GCS applications.
minor comments (2)
- [Abstract] Abstract: Dataset sizes, number of training realizations, error metrics (e.g., relative L2, MAE), statistical significance tests, and validation procedures are not reported, making it impossible to assess whether the stated performance differences are statistically meaningful.
- The manuscript should clarify whether the same loss weighting and hyperparameter tuning protocol was used across all architectures when comparing saturation versus pressure predictions.
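The error metrics and significance checks requested above could look like the following sketch. It computes per-sample relative L2 error and MAE, then a paired bootstrap confidence interval for the mean error difference between two models evaluated on the same test cases. The data are synthetic stand-ins, and the function names are chosen here for illustration; nothing in this block reflects the paper's actual results or protocol.

```python
import numpy as np


def relative_l2(pred, true):
    """Per-sample relative L2 error for arrays shaped (n_samples, ...).

    Assumes every reference sample has a nonzero norm.
    """
    diff = (pred - true).reshape(len(true), -1)
    ref = true.reshape(len(true), -1)
    return np.linalg.norm(diff, axis=1) / np.linalg.norm(ref, axis=1)


def mae(pred, true):
    """Per-sample mean absolute error for arrays shaped (n_samples, ...)."""
    return np.abs(pred - true).reshape(len(true), -1).mean(axis=1)


def paired_bootstrap_ci(err_a, err_b, n_boot=2000, alpha=0.05, seed=0):
    """Confidence interval for mean(err_a - err_b), resampling test cases with replacement."""
    rng = np.random.default_rng(seed)
    d = np.asarray(err_a) - np.asarray(err_b)
    idx = rng.integers(0, len(d), size=(n_boot, len(d)))
    means = d[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lo, hi


# Synthetic stand-in data: model A has half the noise level of model B.
rng = np.random.default_rng(42)
true = rng.standard_normal((50, 16, 16))
pred_a = true + 0.05 * rng.standard_normal(true.shape)
pred_b = true + 0.10 * rng.standard_normal(true.shape)

err_a = relative_l2(pred_a, true)
err_b = relative_l2(pred_b, true)
ci_low, ci_high = paired_bootstrap_ci(err_a, err_b)
# An interval that excludes zero suggests the gap is unlikely to be a test-set artifact.
```

Pairing by test case matters because all models are scored on the same realizations, so per-case differences are less noisy than differences of independent means.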
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our comparative study of deep learning surrogates for geological carbon sequestration. We address the two major comments below and will make revisions to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract] Abstract: The central claim that 'surrogate model performance is strongly dependent on the underlying PDE type (i.e., hyperbolic vs. elliptic)' is not adequately supported by the described experiments. Both fields are generated by the same coupled multiphase system; no ablation holds field statistics fixed while varying only the operator class (e.g., pure Darcy vs. pure advection problems), so the observed U-FNO vs. FNO ranking could equally reflect output regularity, discontinuity handling, or loss weighting rather than the hyperbolic/elliptic distinction.
  Authors: We agree that the experiments are performed on the coupled multiphase flow system and do not include controlled ablations that isolate the PDE type while holding other factors fixed. The observed performance differences between pressure and saturation predictions could indeed arise from other characteristics of the output fields. We will revise the abstract and discussion sections to remove the strong causal claim of dependence on PDE type and instead report the empirical finding that U-FNO performed best on saturation while FNO performed best on pressure, with a note that further work would be needed to isolate the underlying cause.
  Revision: yes
- Referee: [Abstract] Abstract / benchmark description: The 2D single-wellbore injection problem is presented as representative of 'high-dimensional transient subsurface flow problems,' yet no scaling studies or comparisons to 3D heterogeneous cases are referenced to substantiate this; this assumption directly underpins the generalization claims for real GCS applications.
  Authors: The study is conducted on a 2D benchmark with heterogeneous fields to enable controlled comparison of architectures under varying permeability, porosity, and injection conditions. We acknowledge that no scaling studies or 3D results are presented. We will revise the abstract and introduction to describe the setup explicitly as a 2D heterogeneous reservoir benchmark and qualify any generalization statements accordingly, noting extension to 3D as future work.
  Revision: yes
Circularity Check
No circularity: empirical model comparison on benchmark task
The paper is a pure empirical comparative study of existing DL architectures (U-Net, FNO, U-FNO, etc.) on a fixed 2D single-wellbore CO2 injection benchmark. No derivations, ansatzes, fitted parameters renamed as predictions, or self-citation chains appear in the central claims. Performance rankings are reported directly from training and testing on the same dataset; the PDE-type dependence statement is an interpretation of observed numerical results rather than a reduction to prior self-referential inputs. The analysis is self-contained against external benchmarks.