arxiv: 2601.17647 · v2 · submitted 2026-01-25 · 💻 cs.LG · cs.AI

Recognition: 3 theorem links

Knowledge-Guided Time-Varying Causal Inference for Arctic Sea Ice Dynamics

Akila Sampath, Vandana Janeja, Jianwu Wang

Pith reviewed 2026-05-16 11:09 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords causal inference · sea ice thickness · sea surface height · variational autoencoder · time-varying treatments · climate modeling · maximum mean discrepancy · treatment effect estimation

The pith

KGCM-VAE uses physical relationships between sea surface height (SSH) and surface velocity to generate time-varying treatments, then estimates their causal effect on sea ice thickness with lower error than baselines on synthetic data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes KGCM-VAE to quantify the causal impact of sea surface height on sea ice thickness. It generates physically grounded, continuous treatments that can change at every time step by drawing on established links between height and surface velocity. Maximum mean discrepancy (MMD) is added in the latent space to balance treated and control distributions and reduce time-varying confounding bias. On synthetic data the model records lower PEHE (precision in estimation of heterogeneous effect) than state-of-the-art baselines when forecasting thickness responses to hypothetical height scenarios. A real-world case study further checks how physical parameters respond to the generated treatments.

Core claim

The central claim is that the Knowledge-Guided Causal Model Variational Autoencoder generates physically constrained time-varying continuous treatments from sea-surface-height and surface-velocity relationships, applies maximum mean discrepancy to balance distributions in latent space, and thereby produces more accurate estimates of the effect of height changes on sea ice thickness than existing methods, as measured by PEHE on synthetic data.
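To make the metric in this claim concrete, here is a minimal sketch of PEHE in its conventional form: the root mean squared error between true and estimated individual treatment effects. The function name `pehe` and the flat-list inputs are illustrative assumptions; this page does not state which variant the paper uses (for example, how it averages over time steps or whether it reports the squared form).

```python
import math


def pehe(y1_true, y0_true, y1_pred, y0_pred):
    """Root PEHE: sqrt of the mean squared gap between true and estimated effects.

    ASSUMPTION: each argument is a flat sequence with one value per unit
    (or per unit-time pair); the paper's exact aggregation is not given here.
    """
    n = len(y1_true)
    if n == 0 or not (n == len(y0_true) == len(y1_pred) == len(y0_pred)):
        raise ValueError("all inputs must be non-empty and of equal length")
    squared_error = 0.0
    for y1, y0, y1_hat, y0_hat in zip(y1_true, y0_true, y1_pred, y0_pred):
        true_effect = y1 - y0          # known only on synthetic data
        estimated_effect = y1_hat - y0_hat
        squared_error += (true_effect - estimated_effect) ** 2
    return math.sqrt(squared_error / n)
```

Because the true effect requires both potential outcomes, the metric can only be computed where ground truth is simulated, which is why the claim is limited to synthetic data.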

What carries the argument

The KGCM-VAE framework, which turns established physical relationships between sea surface height and surface velocity into time-varying continuous treatments at each time step and uses maximum mean discrepancy to balance treated and control distributions in the latent space.
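The balancing step can be sketched as follows. This is a generic biased estimator of squared MMD with an RBF kernel, written in plain Python for readability; it is not the authors' implementation. The bandwidth, the sample sizes, and the way latent codes are split into "treated" and "control" groups are all assumptions, since the page does not say how the paper groups continuous treatments.

```python
import math
import random


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of vectors."""
    def mean_kernel(ps, qs):
        total = sum(rbf_kernel(p, q, bandwidth) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical 2-D latent codes, for illustration only.
rng = random.Random(0)
z_treated = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
z_control_near = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
z_control_far = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(50)]

print("balanced groups:  ", mmd_squared(z_treated, z_control_near))
print("imbalanced groups:", mmd_squared(z_treated, z_control_far))
```

Used as a penalty in a training loss, a term like this pushes the encoder toward latent codes whose distribution looks the same across treatment groups, which is the sense in which MMD "balances" the latent space.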

If this is right

Lower PEHE on synthetic data when predicting sea ice thickness under hypothetical sea surface height scenarios.
Consistent gains from the maximum mean discrepancy term in ablation studies for treatment effect estimation.
Sensitivity of physical parameters to specific treatments revealed in the real-world case study.
Improved handling of time-varying confounding in climate data compared with standard deep-learning baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same physical-relationship approach to treatment generation could be tested on other climate variables where similar mechanistic links are known.
If the generated treatments remain unbiased, the method could be applied to longer observational records to produce regional projections of ice-thickness change.
Comparing the model's latent-space balancing against explicit instrumental-variable methods on the same data would test whether the MMD step is the main source of the reported error reduction.

Load-bearing premise

The established physical relationships between sea surface height and surface velocity can produce valid, unbiased time-varying continuous treatments at every time step without introducing new confounding.
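One way to see what this premise asks for is to write out the modulation formula quoted in the Lean theorem section of this page, SSH_treat = (1+βσ_t)SSH_t. The sketch below applies it at every time step. The formula itself comes from the quoted paper passage; the specific gate, a sigmoid of the step-to-step SSH change scaled by a hypothetical `steepness` parameter, is an assumption made only for illustration, because the passage elides how σ_t is computed.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def modulated_ssh(ssh, beta=0.5, steepness=10.0):
    """Apply SSH_treat_t = (1 + beta * sigma_t) * SSH_t at every time step.

    ASSUMPTION: sigma_t = sigmoid(steepness * (SSH_t - SSH_{t-1})), with a
    zero change at the first step. The paper's actual gate is not given here.
    """
    if not ssh:
        return []
    treated = []
    previous = ssh[0]
    for value in ssh:
        sigma_t = sigmoid(steepness * (value - previous))
        treated.append((1.0 + beta * sigma_t) * value)
        previous = value
    return treated


# Hypothetical SSH series in metres, for illustration only.
example_ssh = [0.10, 0.12, 0.20, 0.18, 0.18]
print(modulated_ssh(example_ssh))
```

Since the gate lies strictly between 0 and 1, each treated value stays between the original SSH and (1+β) times it for positive SSH, so the treatment is a bounded, deterministic function of the SSH history. Whether that dependence introduces new confounding is exactly what the premise leaves open.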

What would settle it

A controlled physical simulation or independent observational record that shows the model's predicted sea ice thickness changes systematically deviate from the true responses under the same sea surface height forcing sequences.

Figures

Figures reproduced from arXiv: 2601.17647 by Akila Sampath, Jianwu Wang, Vandana Janeja.

**Figure 1.** The architecture of the Knowledge-Guided Causal Model Variational Autoencoder (KGCM-VAE) integrates a balanced latent space z learned by the encoder with a knowledge-guided causal model in the decoder for robust counterfactual prediction and Individual Treatment Effect estimation.

**Figure 2.** Real-world spatial analyses of factual and counterfactual outcomes. The panels display the temporal patterns of perturbed Ocean Velocity (top left) and the corresponding counterfactual prediction of Sea Ice Thickness (bottom left), alongside perturbed Sea Surface Height (top right) and the corresponding counterfactual prediction of Sea Ice Thickness (bottom right).

Original abstract

Quantifying the causal relationship between sea ice thickness and sea surface height (SSH) is essential for understanding the mechanisms driving polar climate change and global sea-level rise. Conventional deep learning models often struggle with treatment effect estimation in climate settings due to time-varying confounding and the lack of physical constraints. To address these challenges, we propose the Knowledge-Guided Causal Model Variational Autoencoder (KGCM-VAE) to quantify the effect of SSH on sea ice thickness. The framework leverages established physical relationships between SSH and surface velocity to generate physically grounded, time-varying continuous treatments, where each treatment value can change at every time step within a sequence. The model also incorporates Maximum Mean Discrepancy (MMD) to balance treated and control distributions in the latent space, mitigating observed confounding bias. Using synthetic data, we evaluated the model's ability to predict sea ice thickness responses under hypothetical SSH forcing scenarios, demonstrating that KGCM-VAE achieves superior PEHE compared to state-of-the-art baselines. Ablation studies further confirm that MMD consistently enhances treatment effect estimation over the base model. Additionally, we conducted a real-world case study to examine the sensitivity of physical parameters to specific treatments and to compare these findings with an existing modeling study.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

KGCM-VAE adds a knowledge-guided VAE plus MMD for time-varying causal effects of SSH on sea ice thickness, but the synthetic evaluation is built from the same physical relationships it assumes, which limits how much the PEHE gains tell us.

The main thing here is a new framework called KGCM-VAE that generates continuous, time-varying treatments from established SSH-surface velocity physics, then uses a VAE with MMD to balance distributions and estimate effects on sea ice thickness. It reports better PEHE than baselines on synthetic data and includes an ablation plus a real-world sensitivity check against an existing modeling study. That combination of domain knowledge for treatment generation and standard balancing is the concrete step forward, and the time-varying aspect fits the climate setting better than static methods. The real-world case study is a plus for grounding, even if details are thin in the abstract.

The soft spot is the evaluation design. Treatments come from the same physical relationships used to build the synthetic ground truth, so superior PEHE can appear simply because the model matches the simulator rather than recovering effects under realistic mismatch or extra confounders. Without numerical values, error bars, or clear checks on assumption violations, it's hard to judge how much the gains transfer. The central claim about hypothetical SSH forcing scenarios rests on that synthetic setup, which weakens the evidence for broader use.

This is the kind of paper that belongs in a climate-ML or causal inference venue where reviewers can push on the validation. It deserves peer review because the problem is relevant and the approach is a reasonable extension, but it will need stronger out-of-distribution tests and more transparent metrics to hold up.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Knowledge-Guided Causal Model Variational Autoencoder (KGCM-VAE) to estimate the causal effect of sea surface height (SSH) on Arctic sea ice thickness. It generates time-varying continuous treatments at each time step by leveraging established physical relationships between SSH and surface velocity, incorporates MMD to balance treated and control distributions in latent space, and reports superior PEHE on synthetic data relative to baselines, with an additional real-world case study on parameter sensitivity.

Significance. If the central claims hold under non-circular evaluation, the framework would offer a concrete way to embed physical constraints into time-varying causal models for climate applications. The combination of knowledge-guided treatment generation and MMD balancing addresses a recognized gap in handling continuous, time-varying confounding; however, the current evaluation provides no numerical PEHE values, error bars, or robustness checks, limiting immediate impact.

major comments (2)

[Synthetic data generation and evaluation] The synthetic data generation procedure (described in the methods and evaluation sections) creates treatments directly from the same SSH-surface velocity physical relationships that inform the KGCM-VAE components. This alignment risks circular validation: superior PEHE may arise because the model is consistent with the simulator rather than because it recovers unbiased effects under realistic confounding mismatch. A concrete test on data generated under violated physical assumptions is needed to support the claim for hypothetical SSH forcing scenarios.
[Abstract and §4 (Evaluation)] The abstract states that KGCM-VAE achieves superior PEHE and that ablation studies confirm MMD benefits, yet supplies no numerical values, confidence intervals, baseline identities, or dataset sizes. Without these quantities it is impossible to judge whether the reported advantage is statistically meaningful or practically large enough to support the headline claim.

minor comments (2)

[Abstract] The real-world case study is mentioned only in the abstract; the manuscript should include quantitative sensitivity results and direct comparison to the referenced existing modeling study.
[Model description] Notation for the time-varying treatment variable and the precise form of the MMD penalty should be defined explicitly in the model section to aid reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to strengthen the evaluation and provide the requested quantitative details.

Referee: [Synthetic data generation and evaluation] The synthetic data generation procedure (described in the methods and evaluation sections) creates treatments directly from the same SSH-surface velocity physical relationships that inform the KGCM-VAE components. This alignment risks circular validation: superior PEHE may arise because the model is consistent with the simulator rather than because it recovers unbiased effects under realistic confounding mismatch. A concrete test on data generated under violated physical assumptions is needed to support the claim for hypothetical SSH forcing scenarios.

Authors: We acknowledge the validity of this concern about potential circularity. The synthetic data generation intentionally uses the established physical relationships to create a controlled setting with known ground-truth causal effects, which is a common practice for validating causal models. However, to demonstrate robustness beyond perfect alignment with the simulator, we will add a new set of experiments in the revised manuscript. These will generate synthetic data under deliberately violated physical assumptions (e.g., by injecting noise into the SSH-velocity mapping or employing alternative dynamical models) and report KGCM-VAE performance on these mismatched datasets. This addition will directly test the model's behavior under realistic confounding mismatch.
Revision: yes
Referee: [Abstract and §4 (Evaluation)] The abstract states that KGCM-VAE achieves superior PEHE and that ablation studies confirm MMD benefits, yet supplies no numerical values, confidence intervals, baseline identities, or dataset sizes. Without these quantities it is impossible to judge whether the reported advantage is statistically meaningful or practically large enough to support the headline claim.

Authors: We agree that the absence of specific numerical results limits the interpretability of the claims. In the revised manuscript, we will update the abstract to report the actual PEHE values for KGCM-VAE and all baselines, including standard deviations or confidence intervals obtained from multiple random seeds. We will also explicitly name the baseline methods and state the sizes of the synthetic datasets used. Corresponding quantitative tables and error bars will be added to Section 4, along with the ablation results for the MMD component, to allow assessment of both statistical significance and practical effect size.
Revision: yes
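The reporting promised in this response could look something like the sketch below, which summarizes per-seed PEHE values as a mean, a sample standard deviation, and a standard error. The helper name `summarize_over_seeds` is hypothetical, and the numbers in the usage lines are placeholders chosen for illustration; they are not results from the paper, which reports no values on this page.

```python
import math
import statistics


def summarize_over_seeds(pehe_values):
    """Return (mean, sample std, standard error) of per-seed PEHE values."""
    if len(pehe_values) < 2:
        raise ValueError("need at least two seeds to estimate spread")
    mean = statistics.mean(pehe_values)
    std = statistics.stdev(pehe_values)  # sample std, n - 1 in the denominator
    sem = std / math.sqrt(len(pehe_values))
    return mean, std, sem


# Placeholder numbers for illustration only; NOT results from the paper.
placeholder_runs = [0.50, 0.60, 0.70]
mean, std, sem = summarize_over_seeds(placeholder_runs)
print(f"PEHE = {mean:.3f} +/- {std:.3f} (std over {len(placeholder_runs)} seeds)")
```

Reporting the same summary for every baseline, from the same seeds, is what would let a reader judge whether a gap in PEHE exceeds run-to-run variation.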

Circularity Check

0 steps flagged

No significant circularity detected

The paper generates synthetic treatments using established external physical relationships between SSH and surface velocity, then applies KGCM-VAE with knowledge-guided components and standard MMD balancing to estimate effects. The PEHE superiority claim is evaluated against independent baselines on data with known ground truth; no equations or steps reduce the reported predictions to the inputs by construction, nor do self-citations or ansatzes create load-bearing loops. The central result remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on domain knowledge of SSH-surface velocity physics and standard VAE/MMD components; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Established physical relationships between SSH and surface velocity generate valid time-varying continuous treatments.
Invoked to create treatments that change at every time step.

pith-pipeline@v0.9.0 · 5519 in / 1083 out tokens · 30099 ms · 2026-05-16T11:09:37.600569+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear
Paper passage: velocity modulation scheme ... smoothed velocity signals are dynamically amplified via a sigmoid function governed by SSH transitions ... $\mathrm{SSH}_{\mathrm{treat}} = (1 + \beta\sigma_t)\,\mathrm{SSH}_t$

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · relation: unclear
Paper passage: MMD ... $P(Z \mid T=1) \approx P(Z \mid T=0)$ ... RBF kernel

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · relation: unclear
Paper passage: synthetic counterfactual $Y_{1,t} = Y_{0,t} + \beta \cdot \tanh(\alpha \cdot (T_{1,t} - T_{0,t} - \mu_T)) + \epsilon$
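The synthetic counterfactual formula in the last passage can be written out directly. The sketch below evaluates Y1 = Y0 + β·tanh(α·(T1 − T0 − μT)) + ϵ per time step. The formula is taken from the quoted passage; the Gaussian form of the noise, the default parameter values, and treating μT as a user-supplied centering constant are assumptions, since the page does not define them.

```python
import math
import random


def synthetic_counterfactual(y0, t1, t0, alpha=1.0, beta=1.0, mu_t=0.0,
                             noise_std=0.0, seed=0):
    """Per-step Y1 = Y0 + beta * tanh(alpha * (T1 - T0 - mu_t)) + eps.

    ASSUMPTIONS: eps is zero-mean Gaussian with standard deviation noise_std,
    and mu_t is a constant supplied by the caller.
    """
    rng = random.Random(seed)
    y1 = []
    for y0_t, t1_t, t0_t in zip(y0, t1, t0):
        eps = rng.gauss(0.0, noise_std)
        effect = beta * math.tanh(alpha * (t1_t - t0_t - mu_t))
        y1.append(y0_t + effect + eps)
    return y1


# Hypothetical control outcomes and treatment paths, for illustration only.
y0 = [2.0, 2.1, 1.9]
t0 = [0.10, 0.12, 0.11]
t1 = [0.15, 0.20, 0.11]
print(synthetic_counterfactual(y0, t1, t0, alpha=5.0, beta=0.3))
```

Because tanh saturates, the simulated effect at any step is bounded by |β| before noise, no matter how large the treatment gap is.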

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
