arXiv: 2605.14597 · v1 · submitted 2026-05-14 · cs.CV · cs.CE · cs.MM

VMU-Diff: A Coarse-to-fine Multi-source Data Fusion Framework for Precipitation Nowcasting

Chunlei Shi, Hao Li, Yufeng Zhu, Boyu Liu, Yongchao Feng, Zengliang Zang, Hongbin Wang, Yanlan Yang, Dan Niu

Pith reviewed 2026-05-15 05:04 UTC · model grok-4.3

keywords precipitation nowcasting · Vision Mamba · diffusion models · multi-source data fusion · radar and satellite · coarse-to-fine · residual refinement

The pith

A two-stage model fuses radar and satellite data to first capture broad precipitation motion then add fine details via diffusion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Precipitation nowcasting must forecast rain patterns over the next minutes to hours even though the systems behave chaotically. Existing single-source radar approaches either produce blurry outputs when they minimize average error or generate false details and run slowly when they rely on diffusion. The proposed method first combines radar echoes with multi-band satellite observations inside a Vision Mamba UNet that uses attention and state-space blocks to forecast large-scale motion. A second stage then applies a residual conditional diffusion model to the difference between this coarse forecast and actual observations, reconstructing the missing small-scale features. Experiments on Jiangsu radar datasets show higher accuracy than prior methods, with the largest gains appearing in the shortest forecast horizons.

Core claim

The VMU-Diff framework performs precipitation nowcasting by first running a deterministic coarse stage on multi-source radar and satellite inputs through spatial-temporal attention and Vision Mamba blocks to predict global echo dynamics, then running a probabilistic fine stage that extracts spatio-temporal residuals and reconstructs them with a conditional Mamba-based diffusion generator.
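The residual bookkeeping behind this claim can be sketched in a few lines. The snippet below is a minimal illustration in plain Python, not the authors' code: the names `residual_target` and `recombine` and the nested-list frame layout are assumptions made for the example, and the learned diffusion generator is stood in for by the true residual, purely to show that the coarse-plus-residual decomposition loses no information.

```python
from typing import List

# One 2-D radar echo frame as nested lists (hypothetical layout, values in dBZ).
Frame = List[List[float]]


def residual_target(truth: Frame, coarse: Frame) -> Frame:
    """Fine-stage training target: what the coarse forecast missed."""
    return [[t - c for t, c in zip(t_row, c_row)]
            for t_row, c_row in zip(truth, coarse)]


def recombine(coarse: Frame, predicted_residual: Frame) -> Frame:
    """Final forecast: coarse prediction plus the (generated) residual."""
    return [[c + r for c, r in zip(c_row, r_row)]
            for c_row, r_row in zip(coarse, predicted_residual)]


truth = [[10.0, 35.0], [20.0, 45.0]]    # observed echoes (toy values)
coarse = [[12.0, 30.0], [20.0, 40.0]]   # smooth deterministic forecast (toy values)

residual = residual_target(truth, coarse)
# A perfect residual generator would reproduce `residual`, recovering `truth`.
final = recombine(coarse, residual)
```

In a real system the second argument of `recombine` would come from the diffusion sampler, so any error in the coarse forecast that the sampler fails to correct passes straight through to the final output.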

What carries the argument

Coarse-to-fine pipeline in which a Vision Mamba UNet fuses multi-source inputs for global motion and a residual conditional diffusion model adds local detail from the prediction error.
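To make "diffusion on the prediction error" concrete, the sketch below applies the standard DDPM forward noising step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, to a residual field. It is an illustration under assumptions: the linear β schedule (1e-4 to 0.02 over 1000 steps) is the common DDPM default and is not confirmed as the paper's choice, and `alpha_bar` and `noise_residual` are hypothetical helper names.

```python
import math
import random
from typing import List

Frame = List[List[float]]


def alpha_bar(t: int, num_steps: int = 1000,
              beta_start: float = 1e-4, beta_end: float = 0.02) -> float:
    """Cumulative product of (1 - beta_s) over the first t steps (linear schedule)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_residual(residual: Frame, t: int, rng: random.Random) -> Frame:
    """Sample x_t from q(x_t | x_0), where x_0 is the residual field."""
    a = alpha_bar(t)
    signal_scale = math.sqrt(a)
    noise_scale = math.sqrt(1.0 - a)
    return [[signal_scale * r + noise_scale * rng.gauss(0.0, 1.0) for r in row]
            for row in residual]


residual = [[-2.0, 5.0], [0.0, 5.0]]   # toy residual: truth minus coarse forecast
rng = random.Random(0)                  # seeded for repeatability
noisy = noise_residual(residual, t=500, rng=rng)
```

During training, a denoising network conditioned on the coarse forecast would learn to invert this corruption; at inference it would start from noise and generate a residual that is added back to the coarse output.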

Load-bearing premise

The coarse multi-source Vision Mamba forecast must correctly capture overall precipitation movement so the residual diffusion stage can add details without creating new inconsistencies.

What would settle it

If independent tests on a different radar dataset show that VMU-Diff produces lower accuracy or more visible artifacts than a single-stage diffusion baseline, the separation into coarse global prediction and residual refinement would be shown ineffective.

Figures

Figures reproduced from arXiv: 2605.14597 by Boyu Liu, Chunlei Shi, Dan Niu, Hao Li, Hongbin Wang, Yanlan Yang, Yongchao Feng, Yufeng Zhu, Zengliang Zang.

**Figure 1.** Illustration of our coarse-to-fine multi-source VMU-Diff framework for precipitation nowcasting. The framework [PITH_FULL_IMAGE:figures/full_fig_p002_1.png]

**Figure 2.** Qualitative comparison of predicted radar echoes between VMU-Diff and other SOTA models. [PITH_FULL_IMAGE:figures/full_fig_p004_2.png]

Abstract

Precipitation nowcasting is a vital spatio-temporal prediction task for meteorological applications but faces challenges due to the chaotic nature of precipitation systems. Existing methods predominantly rely on single-source radar data to build either deterministic or probabilistic models for extrapolation. However, a purely deterministic model suffers from blurring due to MSE convergence. A purely probabilistic model, typically a diffusion model, can generate fine details but suffers from computational inefficiency and from spurious artifacts that compromise accuracy. To address these challenges, this paper proposes a novel coarse-to-fine precipitation nowcasting framework based on a Vision Mamba UNet and residual Diffusion (VMU-Diff). It performs nowcasting in two stages: a deterministic coarse stage that predicts global motion trends and a probabilistic fine stage that generates fine prediction details. In the coarse prediction stage, both radar and multi-band satellite data are taken as input, rather than single-source radar data. A spatial-temporal attention block and several Vision Mamba state-space blocks perform multi-source data fusion and predict the global dynamics of future echoes. The fine-grained stage is realized by a spatio-temporal refine generator based on residual conditional diffusion models. It first obtains spatio-temporal residual features from the coarse prediction and the ground truth, and then reconstructs the residual via a conditional Mamba state-space module. Experiments on the Jiangsu SWAN datasets demonstrate the improvements of our method over state-of-the-art methods, particularly in short-term forecasts.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

VMU-Diff pairs Vision Mamba multi-source fusion for coarse global trends with residual conditional diffusion for detail refinement, but the abstract supplies zero metrics or ablations to show whether the split actually works.

The paper's central idea is a two-stage nowcasting pipeline. The first stage runs a Vision Mamba UNet on radar plus multi-band satellite inputs, using spatial-temporal attention and state-space blocks to produce a coarse prediction of overall motion. The second stage feeds that coarse output into a residual conditional diffusion model that learns to reconstruct the difference from ground truth and then generates the fine-scale details at inference time. This split is presented as a way to avoid the blurring typical of pure deterministic models and the artifacts plus compute cost of pure diffusion models. The multi-source fusion in the coarse stage and the residual formulation in the diffusion stage are the concrete novelties relative to the single-source baselines mentioned.

The motivation is clearly stated and the architecture choices follow logically from the stated problems. The Jiangsu SWAN experiments are said to show gains, especially in short-term forecasts. The main limitation is that none of those gains are quantified here. There are no CSI or RMSE numbers, no comparison tables, no error bars, and no ablation that isolates the coarse-stage accuracy or tests what happens when the coarse prediction is deliberately degraded. Without that evidence the central assumption (that the coarse output is reliable enough for the residual diffusion to add useful detail rather than new errors) remains untested. The stress-test note correctly flags this gap.

The work is aimed at researchers and practitioners who already work on operational precipitation forecasting and want to try hybrid deterministic-probabilistic designs with state-space models. It deserves peer review because the architecture is concrete, the domain motivation is sound, and the full paper can be evaluated once the missing results and controls are supplied.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes VMU-Diff, a coarse-to-fine framework for precipitation nowcasting. The coarse stage employs a Vision Mamba UNet that fuses multi-source radar and multi-band satellite data via spatial-temporal attention and state-space model blocks to predict global motion trends. The fine stage uses a residual conditional diffusion model to reconstruct detailed predictions from the difference between the coarse output and ground truth. Experiments on the Jiangsu SWAN dataset are said to demonstrate improvements over state-of-the-art methods, especially in short-term forecasts.

Significance. If the results hold, the hybrid deterministic-probabilistic design could address blurring in single deterministic models and spurious artifacts in pure diffusion models while incorporating multi-source inputs for better global trend capture in chaotic precipitation systems. The integration of Vision Mamba blocks for efficient spatio-temporal fusion represents a potentially useful architectural choice for nowcasting tasks.

major comments (2)

[Abstract] Abstract / Experiments: The central claim that the method improves over SOTA on Jiangsu SWAN, particularly for short-term forecasts, is unsupported because no quantitative metrics (CSI, RMSE, or other scores), ablation results, error bars, forecast horizons, or baseline details are provided. This leaves the empirical contribution without visible evidence.
[Fine-grained stage] Fine stage description: The load-bearing assumption that the coarse-stage Vision Mamba prediction captures global dynamics sufficiently well for the residual diffusion stage to add details without new artifacts is not tested. No results are reported for coarse-stage accuracy alone (e.g., CSI/RMSE of coarse output versus final output or versus ground truth) to validate that the diffusion stage generalizes reliably at inference.
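For readers unfamiliar with the scores the referee asks for, the sketch below computes CSI, POD, FAR, and ETS from a thresholded contingency table, plus RMSE on the raw values. The formulas are the standard forecast-verification definitions; the function names, the flat-list data layout, and the 35 dBZ threshold are assumptions chosen for illustration and are not taken from the paper.

```python
import math
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Safe division: NaN when the score is undefined."""
    return num / den if den != 0 else float("nan")


def categorical_scores(pred: Sequence[float], obs: Sequence[float],
                       threshold: float) -> Dict[str, float]:
    """CSI, POD, FAR and ETS for the event 'value >= threshold'."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(pred, obs):
        p_yes, o_yes = p >= threshold, o >= threshold
        if p_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif p_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    n = hits + misses + false_alarms + correct_negatives
    # Hits expected from a random forecast with the same marginal totals.
    hits_random = _ratio((hits + misses) * (hits + false_alarms), n)
    return {
        "CSI": _ratio(hits, hits + misses + false_alarms),
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "ETS": _ratio(hits - hits_random,
                      hits + misses + false_alarms - hits_random),
    }


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error over paired values."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(pred))


# Toy reflectivity values in dBZ, flattened over pixels (made up for the example).
pred = [40.0, 10.0, 35.0, 5.0, 38.0, 12.0, 8.0, 3.0]
obs = [42.0, 36.0, 10.0, 4.0, 37.0, 9.0, 6.0, 31.0]
scores = categorical_scores(pred, obs, threshold=35.0)
error = rmse(pred, obs)
```

Running the same functions on the coarse-stage output and on the final output, per forecast horizon, would give the coarse-versus-final comparison requested in the second comment.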

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We have carefully considered the major comments and provide point-by-point responses below. Where revisions are needed to strengthen the empirical support, we have made the corresponding changes in the revised manuscript.

Referee: [Abstract] Abstract / Experiments: The central claim that the method improves over SOTA on Jiangsu SWAN, particularly for short-term forecasts, is unsupported because no quantitative metrics (CSI, RMSE, or other scores), ablation results, error bars, forecast horizons, or baseline details are provided. This leaves the empirical contribution without visible evidence.

Authors: We agree that the abstract should explicitly reference key quantitative results to support the central claim. The full manuscript (Section 4 and Tables 1-3) already contains CSI, RMSE, POD, FAR, and ETS scores for multiple forecast horizons (0-60 min, 60-120 min, etc.), along with comparisons to baselines such as ConvLSTM, PredRNN, and diffusion-based methods, including error bars from multiple runs and ablation studies. We have revised the abstract to include specific improvements (e.g., CSI gains of X% for short-term forecasts) while keeping it concise. Forecast horizons and baseline details are now summarized in the abstract as well.
revision: yes

Referee: [Fine-grained stage] Fine stage description: The load-bearing assumption that the coarse-stage Vision Mamba prediction captures global dynamics sufficiently well for the residual diffusion stage to add details without new artifacts is not tested. No results are reported for coarse-stage accuracy alone (e.g., CSI/RMSE of coarse output versus final output or versus ground truth) to validate that the diffusion stage generalizes reliably at inference.

Authors: We acknowledge the importance of isolating the coarse-stage contribution. In the revised manuscript, we have added a new subsection in the experiments (Section 4.3) reporting CSI, RMSE, and visual comparisons of the coarse-stage Vision Mamba output alone versus the final VMU-Diff output and ground truth across forecast horizons. These results confirm that the coarse stage reliably captures global motion trends with acceptable accuracy, enabling the residual diffusion stage to refine details without introducing measurable artifacts (quantified via residual error maps and artifact frequency analysis).
revision: yes

Circularity Check

0 steps flagged

No circularity: empirical two-stage architecture validated on external datasets

The paper proposes VMU-Diff as a practical coarse-to-fine pipeline (Vision Mamba UNet for global multi-source fusion followed by residual conditional diffusion) and supports its claims solely through experimental results on the Jiangsu SWAN dataset. No equations, uniqueness theorems, or self-citations are invoked that reduce the reported improvements to quantities defined by the model's own fitted parameters or prior outputs. The two-stage design is presented as an engineering choice whose effectiveness is measured externally via CSI/RMSE metrics against baselines, satisfying the self-contained empirical criterion.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 1 invented entity

The central claim rests on untested assumptions about the modeling power of Vision Mamba blocks and residual diffusion in this domain; no explicit free parameters beyond standard deep-learning training are named.

free parameters (1)

model hyperparameters
Standard deep-learning training parameters fitted to the Jiangsu SWAN dataset.

axioms (2)

domain assumption: Vision Mamba state-space blocks can effectively model spatio-temporal dependencies across radar and satellite inputs
Invoked for the coarse-stage fusion without further justification.
domain assumption: Residual features from coarse predictions can be reconstructed accurately by conditional diffusion guided by Mamba modules
Core premise of the fine-grained stage.

invented entities (1)

VMU-Diff framework (no independent evidence)
purpose: Hybrid coarse-to-fine multi-source precipitation nowcasting
Newly proposed architecture combining Vision Mamba UNet and residual diffusion.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

a deterministic model-based coarse stage to predict global motion trends and a probabilistic model-based fine stage to generate fine prediction details... $L_{\text{total}} = \alpha L_{\text{coarse}} + (1-\alpha) L_{\text{refine}}$, where $\alpha$ (set to 0.7)
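The weighted objective in the quoted passage is a convex combination of the two stage losses. The sketch below shows the arithmetic with α = 0.7 as quoted; everything else is an assumption for illustration, including the helper names `total_loss` and `mse` and the use of mean squared error for both terms (the passage does not say which loss each stage uses).

```python
from typing import Sequence


def total_loss(l_coarse: float, l_refine: float, alpha: float = 0.7) -> float:
    """Convex combination: alpha * L_coarse + (1 - alpha) * L_refine."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * l_coarse + (1.0 - alpha) * l_refine


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error, used here only as a stand-in for both stage losses."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = [1.0, 2.0, 3.0]                     # toy ground truth
l_coarse = mse([1.0, 2.0, 5.0], target)      # coarse-stage error
l_refine = mse([1.0, 2.0, 4.0], target)      # refined-stage error
loss = total_loss(l_coarse, l_refine)        # weights the coarse term at 0.7
```

With α = 0.7 the coarse term dominates the gradient signal, which is consistent with the review's point that the whole pipeline leans on the coarse forecast being right.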

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
