Fast Kernel-Space Diffusion for Remote Sensing Pansharpening

Hancong Jin; Jingjing Li; Liang-Jian Deng; Zihan Cao

arxiv: 2505.18991 · v3 · pith:Y7GV6ZASnew · submitted 2025-05-25 · 💻 cs.CV

Pith reviewed 2026-05-22 01:18 UTC · model grok-4.3

classification 💻 cs.CV

keywords pansharpening · diffusion models · remote sensing · kernel generation · low-rank tensors · multi-head attention · image fusion · fast inference

The pith

KSDiff shifts diffusion to kernel space to fuse satellite images with over 500 times faster inference and better quality.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Pansharpening fuses high-resolution panchromatic images with low-resolution multispectral data to produce outputs rich in both spatial detail and spectral information. Existing deep learning methods often miss the broad statistical patterns in remote sensing scenes, while diffusion models that could capture those patterns run too slowly for real use. KSDiff moves the diffusion process into the generation of convolutional kernels that already embed global context from the data. These kernels are built by combining a low-rank core tensor generator with a unified factor generator under the direction of structure-aware multi-head attention. A two-stage training procedure lets the module drop into existing pansharpening networks, delivering higher-quality results at more than 500 times the speed of prior diffusion baselines.

Core claim

KSDiff constructs convolutional kernels enriched with global context through the integration of a low-rank core tensor generator and a unified factor generator, orchestrated by a structure-aware multi-head attention mechanism. This kernel-space diffusion approach, supported by a two-stage training strategy, allows integration into standard pansharpening pipelines while capturing global priors inherent in remote sensing data distributions.
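
The summary above does not spell out the factorization, so the following is only a sketch under the assumption that the kernel is assembled Tucker-style from a small core tensor and one factor matrix per mode. The shapes, ranks, and function names are hypothetical, and random arrays stand in for the outputs of the two generators.

```python
import numpy as np

def assemble_kernel(core, factors):
    """Tucker-style reconstruction: contract a small core tensor with one
    factor matrix per kernel mode (out channels, in channels, height, width)."""
    u_out, u_in, u_h, u_w = factors
    # core has shape (r_out, r_in, r_h, r_w); each factor has shape (full_dim, rank).
    return np.einsum("abcd,oa,ib,hc,wd->oihw", core, u_out, u_in, u_h, u_w)

def parameter_counts(full_shape, ranks):
    """Compare the dense kernel size with the core-plus-factors size."""
    dense = int(np.prod(full_shape))
    low_rank = int(np.prod(ranks)) + sum(d * r for d, r in zip(full_shape, ranks))
    return dense, low_rank

rng = np.random.default_rng(0)
full_shape = (32, 32, 3, 3)  # hypothetical (C_out, C_in, k_h, k_w)
ranks = (4, 4, 3, 3)         # hypothetical Tucker ranks, not taken from the paper

# Random stand-ins for what the core and factor generators would produce.
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((d, r)) for d, r in zip(full_shape, ranks)]

kernel = assemble_kernel(core, factors)
dense, low_rank = parameter_counts(full_shape, ranks)
print(kernel.shape, dense, low_rank)  # (32, 32, 3, 3) 9216 418
```

With these illustrative sizes a generator would need to emit 418 numbers rather than 9216, which is one plausible reason generating kernels this way could be cheaper than generating pixels. The actual saving depends on the ranks the authors chose.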

What carries the argument

A low-rank core tensor generator and a unified factor generator, orchestrated by structure-aware multi-head attention, produce convolutional kernels enriched with global context.

If this is right

Pansharpening outputs achieve higher quality than recent competing methods on standard evaluation metrics.
Inference runs more than 500 times faster than existing diffusion-based pansharpening models.
The module integrates into existing pansharpening architectures through a two-stage training procedure.
Global priors from remote sensing distributions are captured without direct pixel-space diffusion.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The kernel-generation strategy could be tested on other remote-sensing fusion tasks such as hyperspectral sharpening.
Real-time processing pipelines for satellite imagery streams may become feasible with this level of acceleration.
Tensor-factorization patterns from the method might transfer to efficiency improvements in related enhancement models.

Load-bearing premise

Integrating a low-rank core tensor generator and unified factor generator with structure-aware multi-head attention will reliably capture global priors in remote sensing data distributions while delivering the claimed inference speedup without quality loss.

What would settle it

Direct timing and quality measurements on standard remote sensing benchmark datasets would falsify the central performance claims if they showed an inference speedup below 500-fold or pansharpening metrics worse than those of diffusion baselines.
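
The timing half of that check can be sketched as follows. This is a minimal illustration, not the paper's benchmarking protocol: the helper names are hypothetical, the latencies in the example are invented, and a real measurement would also need warm-up runs and device synchronization.

```python
import time

def median_latency(fn, repeats=5):
    """Median wall-clock seconds over several calls to fn()."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate latency must be positive")
    return baseline_seconds / candidate_seconds

def claim_survives(baseline_seconds, candidate_seconds, threshold=500.0):
    """True when the measured speedup exceeds the claimed threshold."""
    return speedup(baseline_seconds, candidate_seconds) > threshold

# Hypothetical latencies in seconds, for illustration only (not from the paper).
print(speedup(128.0, 0.25))         # 512.0
print(claim_survives(128.0, 0.25))  # True
print(claim_survives(128.0, 0.5))   # False
```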

Figures

Figures reproduced from arXiv: 2505.18991 by Hancong Jin, Jingjing Li, Liang-Jian Deng, Zihan Cao.

**Figure 2.** Kernel Generator of our proposed KSDiff. The kernel generator comprises two sub-modules: (1) a diffusion model-driven …

**Figure 3.** Pyramid Latent Fusion Encoder (PLFE). The figure …

**Figure 4.** An overview of our two-stage training procedure and inference process. (a) …

**Figure 5.** Comparison of qualitative results for representative methods on the GF2 reduced-resolution dataset. The first row displays RGB …

**Figure 6.** Visualization of latent representations generated by KSDiff from PAN and LRMS images across different scenes.

**Figure 7.** Baseline network. … the main text. All core components, including the cross-attention mechanism and the Fusion-Gate module, remain identical to those in PLFE1. The role of PLFE2 differs from PLFE1. In PLFE1, the module encodes the ground truth, PAN, and LRMS images to obtain a latent representation $z_0 \in \mathbb{R}^{N \times C_z}$. In contrast, PLFE2 produces a conditioning vector $c \in \mathbb{R}^{N \times C_z}$ with only PAN and LRMS images as inputs…

**Figure 8.** (a) PLFE2. (b) The cross attention in latent encoders. 7. Details on Experiments. 7.1. Datasets. We conducted experiments using datasets derived from WorldView-3 (WV3), QuickBird (QB), and GaoFen-2 (GF2) satellite imagery. These datasets consist of image patches cropped from full remote sensing scenes and are partitioned into training and testing subsets. The WV3 dataset contains four images from two geograp…

**Figure 9.** Comparison of qualitative results for representative methods on the WV3 reduced-resolution dataset.

**Figure 10.** Comparison of qualitative results for representative methods on the WV3 full-resolution dataset.

**Figure 11.** Comparison of qualitative results for representative methods on the GF2 reduced-resolution dataset.

**Figure 12.** Comparison of qualitative results for representative methods on the GF2 full-resolution dataset.

**Figure 13.** Comparison of qualitative results for representative methods on the QB reduced-resolution dataset.

**Figure 14.** Comparison of qualitative results for representative methods on the QB full-resolution dataset.

Original abstract

Pansharpening seeks to fuse high-resolution panchromatic (PAN) and low-resolution multispectral (LRMS) images into a single image with both fine spatial and rich spectral detail. Despite progress in deep learning-based approaches, existing methods often fail to capture global priors inherent in remote sensing data distributions. Diffusion-based models have recently emerged as promising solutions due to their powerful distribution mapping capabilities; however, they suffer from heavy inference latency. We introduce KSDiff, a fast kernel-space diffusion framework that generates convolutional kernels enriched with global context to enhance pansharpening quality and accelerate inference. Specifically, KSDiff constructs these kernels through the integration of a low-rank core tensor generator and a unified factor generator, orchestrated by a structure-aware multi-head attention mechanism. We further introduce a two-stage training strategy tailored for pansharpening, facilitating integration into existing pansharpening architectures. Experiments show that KSDiff achieves superior performance compared to recent promising methods, with over $500\times$ faster inference than diffusion-based pansharpening baselines. Ablation studies, visualizations, and further evaluations substantiate the effectiveness of our approach. Code will be released upon possible acceptance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces KSDiff, a kernel-space diffusion framework for remote sensing pansharpening. It generates convolutional kernels enriched with global context via a low-rank core tensor generator integrated with a unified factor generator, orchestrated by structure-aware multi-head attention. A two-stage training strategy allows integration into existing pansharpening architectures. Experiments claim superior performance over recent methods together with over 500× faster inference than diffusion-based pansharpening baselines, supported by ablations and visualizations.

Significance. If the performance and speedup claims are substantiated with full quantitative evidence, the work would be significant for the field. It directly tackles the inference latency barrier that has limited diffusion models in remote-sensing applications, while proposing a kernel-generation approach to capture global priors. The two-stage training and plug-in design are practical strengths that could facilitate adoption.

major comments (2)

[Method (low-rank core tensor generator and unified factor generator)] The low-rank core tensor generator is load-bearing for both the claimed global-prior capture and the 500× speedup. No derivation, approximation bound, or rank-selection analysis is provided showing that the chosen rank preserves the long-range spectral-spatial correlations required for accurate distribution mapping in remote-sensing data; if critical dependencies are discarded, the superior-performance claim would not hold even if runtime improves.
[Experiments and results] The abstract states superior performance and 500× faster inference, yet the provided description contains no quantitative tables, specific metrics (PSNR/SSIM), datasets, error bars, or statistical tests. Full experimental results with baseline comparisons and ablation tables are required to verify the central claims.
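
The quality metrics named in this exchange have standard definitions. A minimal NumPy sketch of PSNR, SAM, and ERGAS follows, assuming images are float arrays of shape (H, W, bands) scaled to [0, 1] and a PAN-to-MS resolution ratio of 4. The function names are illustrative and are not taken from the paper's code, and SSIM is omitted for brevity.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB over the whole image."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))

def sam_degrees(reference, estimate, eps=1e-12):
    """Mean spectral angle in degrees between per-pixel spectra."""
    dot = np.sum(reference * estimate, axis=-1)
    norms = np.linalg.norm(reference, axis=-1) * np.linalg.norm(estimate, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.degrees(np.mean(np.arccos(cos))))

def ergas(reference, estimate, ratio=4):
    """ERGAS: 100 / ratio times the root of the mean per-band (RMSE / mean)^2."""
    rmse_sq = np.mean((reference - estimate) ** 2, axis=(0, 1))
    mean_sq = np.mean(reference, axis=(0, 1)) ** 2
    return float(100.0 / ratio * np.sqrt(np.mean(rmse_sq / mean_sq)))

rng = np.random.default_rng(0)
# Synthetic stand-in images, not real satellite data.
reference = rng.uniform(0.1, 1.0, size=(16, 16, 4))
estimate = reference + 0.01 * rng.standard_normal(reference.shape)
print(psnr(reference, estimate), sam_degrees(reference, estimate), ergas(reference, estimate))
```

Higher PSNR is better, while lower SAM and ERGAS are better, so a table that mixes them needs the direction stated per column.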

minor comments (1)

[Abstract and training strategy] Ensure all hyperparameters for the two-stage training and attention mechanism are explicitly listed for reproducibility; the promise to release code is noted positively.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We appreciate the recognition of the practical strengths of the two-stage training and plug-in design, as well as the potential significance if the performance and speedup claims are fully substantiated. We address each major comment below and outline the revisions planned for the next version of the manuscript.

Referee: [Method (low-rank core tensor generator and unified factor generator)] The low-rank core tensor generator is load-bearing for both the claimed global-prior capture and the 500× speedup. No derivation, approximation bound, or rank-selection analysis is provided showing that the chosen rank preserves the long-range spectral-spatial correlations required for accurate distribution mapping in remote-sensing data; if critical dependencies are discarded, the superior-performance claim would not hold even if runtime improves.

Authors: We acknowledge that the manuscript currently lacks a formal derivation, approximation bound, or explicit rank-selection analysis for the low-rank core tensor generator. The design choices were guided by empirical ablations showing that the selected rank maintains competitive performance, but we agree this is insufficient to rigorously demonstrate preservation of long-range spectral-spatial correlations. In the revised manuscript, we will add a new subsection under the method description that includes (i) a brief tensor-decomposition perspective on why the chosen rank is expected to retain key global priors and (ii) additional quantitative analysis (e.g., correlation-preservation metrics and performance sensitivity curves across ranks) to support the claim that critical dependencies are not discarded. revision: yes
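
One simple form a rank-sensitivity analysis could take is the relative error of the best rank-r approximation of an unfolded kernel, computed from its singular values. The sketch below assumes a matrix unfolding and a synthetic rank-8-plus-noise stand-in; it is not the analysis the authors propose, and all sizes are hypothetical.

```python
import numpy as np

def truncation_error(matrix, rank):
    """Relative Frobenius error of the best rank-`rank` approximation,
    obtained from the discarded singular values (Eckart-Young)."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    discarded = np.sum(singular_values[rank:] ** 2)
    total = np.sum(singular_values ** 2)
    return float(np.sqrt(discarded / total))

rng = np.random.default_rng(0)
# Hypothetical stand-in: a 64 x 288 unfolded kernel that is rank 8 plus small noise.
signal = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 288))
unfolded = signal + 0.01 * rng.standard_normal((64, 288))

for rank in (2, 4, 8, 16):
    print(rank, round(truncation_error(unfolded, rank), 4))
```

A curve like this only bounds reconstruction error of the kernel itself. Whether pansharpening quality follows the same trend would still have to be measured on the task metrics.
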
Referee: [Experiments and results] The abstract states superior performance and 500× faster inference, yet the provided description contains no quantitative tables, specific metrics (PSNR/SSIM), datasets, error bars, or statistical tests. Full experimental results with baseline comparisons and ablation tables are required to verify the central claims.

Authors: The full manuscript contains an Experiments section with quantitative tables reporting PSNR, SSIM, SAM, and ERGAS on standard remote-sensing datasets (e.g., WorldView-3, GaoFen-2), direct comparisons against recent pansharpening and diffusion baselines, and ablation studies. However, we recognize that the presentation may not have been sufficiently prominent or complete for verification. In the revision we will (i) expand the main results table to include error bars from multiple random seeds, (ii) add a statistical significance analysis (paired t-tests or Wilcoxon tests) for the reported improvements, and (iii) ensure every claim in the abstract is explicitly cross-referenced to the corresponding table or figure. We will also move key ablation results into the main paper if they were previously in the supplement. revision: yes
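
The paired t-test mentioned in this response reduces to a statistic on per-image score differences. A minimal standard-library sketch is given below; the function name and the scores are hypothetical, and turning the statistic into a p-value would additionally require the t-distribution, which is not included here.

```python
import math
import statistics

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic and degrees of freedom for two methods
    evaluated on the same test images."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    spread = statistics.stdev(diffs)  # sample standard deviation of differences
    if spread == 0:
        raise ValueError("differences have zero variance")
    t_value = statistics.mean(diffs) / (spread / math.sqrt(len(diffs)))
    return t_value, len(diffs) - 1

# Hypothetical per-image scores for two methods (not from the paper).
method_a = [2.0, 4.0, 6.0]
method_b = [1.0, 2.0, 3.0]
t_value, dof = paired_t_statistic(method_a, method_b)
print(t_value, dof)
```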

Circularity Check

0 steps flagged

No circularity: claims rest on experiments, not self-referential definitions or fits

The paper presents KSDiff as an architectural proposal that integrates a low-rank core tensor generator, unified factor generator, and structure-aware multi-head attention to produce convolutional kernels for kernel-space diffusion. Performance claims (superior results and >500× speedup) are explicitly tied to experimental validation, ablation studies, and comparisons against baselines rather than any derivation that reduces by construction to fitted parameters, self-citations, or renamed inputs. No equations or steps in the described method equate a 'prediction' to its own training data or invoke a uniqueness theorem from the authors' prior work as load-bearing justification. The two-stage training strategy is framed as an integration aid, not a circular prediction. This is a standard empirical ML contribution whose central assertions remain independently testable against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no specific free parameters, axioms, or invented entities beyond the high-level method components can be identified; the two-stage training and kernel generators are presented as methodological choices.

pith-pipeline@v0.9.0 · 5742 in / 1134 out tokens · 53728 ms · 2026-05-22T01:18:14.308407+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

KSDiff constructs these kernels through the integration of a low-rank core tensor generator and a unified factor generator, orchestrated by a structure-aware multi-head attention mechanism... diffusion process in latent space

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

We introduce KSDiff, a fast kernel-space diffusion framework that generates convolutional kernels enriched with global context

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages · 8 internal anchors

[1]

Mtf-tailored multiscale fusion of high-resolution ms and pan imagery.Photogrammetric Engineering & Remote Sensing, 72(5):591–596, 2006

Bruno Aiazzi, Luciano Alparone, Stefano Baronti, Andrea Garzelli, and Massimo Selva. Mtf-tailored multiscale fusion of high-resolution ms and pan imagery.Photogrammetric Engineering & Remote Sensing, 72(5):591–596, 2006. 7, 8, 2, 3, 4

work page 2006
[2]

Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

Michael S Albergo, Nicholas M Boffi, and Eric Vanden- Eijnden. Stochastic interpolants: A unifying framework for flows and diffusions.arXiv preprint arXiv:2303.08797, 2023. 3

work page internal anchor Pith review Pith/arXiv arXiv 2023
[3]

Full-resolution quality Table 2

Alberto Arienzo, Gemine Vivone, Andrea Garzelli, Luciano Alparone, and Jocelyn Chanussot. Full-resolution quality Table 2. Result on the GF2 reduced-resolution dataset. The best results are highlighted in bold and the second best results are under- lined. Method GaoFen-2SAM (±std) ERGAS (±std) Q2n(±std) SCC (±std) BDSD-PC [51]1.7110±0.0718 1.7025±0.0907 0...

work page arXiv 1943
[4]

Automating spectral unmixing of aviris data using convex geometry concepts

Joseph W Boardman. Automating spectral unmixing of aviris data using convex geometry concepts. InJPL, Summaries of the 4th Annual JPL Airborne Geoscience Workshop. Volume 1: AVIRIS Workshop, 1993. 6

work page 1993
[5]

Diffusion model with disentangled modulations for sharpening multispectral and hyperspectral images.Information Fusion, 104:102158, 2024

Zihan Cao, Shiqi Cao, Liang-Jian Deng, Xiao Wu, Junming Hou, and Gemine Vivone. Diffusion model with disentangled modulations for sharpening multispectral and hyperspectral images.Information Fusion, 104:102158, 2024. 2, 3, 5, 6

work page 2024
[6]

Diffusion Posterior Sampling for General Noisy Inverse Problems

Hyungjin Chung, Jeongsol Kim, Michael T Mccann, Marc L Klasky, and Jong Chul Ye. Diffusion posterior sam- pling for general noisy inverse problems.arXiv preprint arXiv:2209.14687, 2022. 3

work page internal anchor Pith review Pith/arXiv arXiv 2022
[7]

Diffusion schr ¨odinger bridge with applications to score-based generative modeling.Advances in Neural Information Processing Systems, 34:17695–17709, 2021

Valentin De Bortoli, James Thornton, Jeremy Heng, and Ar- naud Doucet. Diffusion schr ¨odinger bridge with applications to score-based generative modeling.Advances in Neural Information Processing Systems, 34:17695–17709, 2021. 3

work page 2021
[8]

Detail injection-based deep convolutional neural networks for pansharpening.IEEE Transactions on Geo- science and Remote Sensing, 59(8):6995–7010, 2020

Liang-Jian Deng, Gemine Vivone, Cheng Jin, and Jocelyn Chanussot. Detail injection-based deep convolutional neural networks for pansharpening.IEEE Transactions on Geo- science and Remote Sensing, 59(8):6995–7010, 2020. 2, 6, 7, 8, 9, 3, 4

work page 2020
[9]

Machine learning in pansharpening: A benchmark, from shallow to deep networks.IEEE Geo- science and Remote Sensing Magazine, 10(3):279–315, 2022

Liang-Jian Deng, Gemine Vivone, Mercedes E Paoletti, Giuseppe Scarpa, Jiang He, Yongjun Zhang, Jocelyn Chanus- sot, and Antonio Plaza. Machine learning in pansharpening: A benchmark, from shallow to deep networks.IEEE Geo- science and Remote Sensing Magazine, 10(3):279–315, 2022. 6, 2

work page 2022
[10]

Content-adaptive non-local convolution for remote sensing pansharpening

Yule Duan, Xiao Wu, Haoyu Deng, and Liang-Jian Deng. Content-adaptive non-local convolution for remote sensing pansharpening. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 27738– 27747, 2024. 2

work page 2024
[11]

Scaling rectified flow trans- formers for high-resolution image synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim En- tezari, Jonas M¨uller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow trans- formers for high-resolution image synthesis. InForty-first international conference on machine learning, 2024. 3

work page 2024
[12]

Hypercomplex quality assessment of multi/hyperspectral images.IEEE Geoscience and Remote Sensing Letters, 6(4):662–665, 2009

Andrea Garzelli and Filippo Nencini. Hypercomplex quality assessment of multi/hyperspectral images.IEEE Geoscience and Remote Sensing Letters, 6(4):662–665, 2009. 6

work page 2009
[13]

Hqg-net: Unpaired medical image enhancement with high-quality guid- ance.IEEE Transactions on Neural Networks and Learning Systems, 2023

Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longxi- ang Tang, Yulun Zhang, Yaowei Wang, and Xiu Li. Hqg-net: Unpaired medical image enhancement with high-quality guid- ance.IEEE Transactions on Neural Networks and Learning Systems, 2023. 6

work page 2023
[14]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceed- ings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016. 1, 3

work page 2016
[15]

Pansharpening via detail injec- tion based convolutional neural networks.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(4):1188–1204, 2019

Lin He, Yizhou Rao, Jun Li, Jocelyn Chanussot, Antonio Plaza, Jiawei Zhu, and Bo Li. Pansharpening via detail injec- tion based convolutional neural networks.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(4):1188–1204, 2019. 2, 7, 8, 9, 3, 4

work page 2019
[16]

Pan- mamba: Effective pan-sharpening with state space model

Xuanhua He, Ke Cao, Jie Zhang, Keyu Yan, Yingying Wang, Rui Li, Chengjun Xie, Danfeng Hong, and Man Zhou. Pan- mamba: Effective pan-sharpening with state space model. Information Fusion, 115:102779, 2025. 2, 7, 8, 3, 4

work page 2025
[17]

Gaussian Error Linear Units (GELUs)

Dan Hendrycks and Kevin Gimpel. Gaussian error linear units (gelus).arXiv preprint arXiv:1606.08415, 2016. 3

work page internal anchor Pith review Pith/arXiv arXiv 2016
[18]

Denoising diffu- sion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffu- sion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020. 2, 3, 5, 6

work page 2020
[19]

LoRA: Low-rank adaptation of large language models

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen- Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022. 8

work page 2022
[20]

Lagconv: Local-context adap- tive convolution kernels with global harmonic bias for pan- sharpening

Zi-Rong Jin, Tian-Jing Zhang, Tai-Xiang Jiang, Gemine Vivone, and Liang-Jian Deng. Lagconv: Local-context adap- tive convolution kernels with global harmonic bias for pan- sharpening. InProceedings of the AAAI conference on ar- tificial intelligence, pages 1113–1121, 2022. 2, 7, 8, 9, 3, 4

work page 2022
[21]

Elucidating the design space of diffusion-based generative models.Advances in neural information processing systems, 35:26565–26577, 2022

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models.Advances in neural information processing systems, 35:26565–26577, 2022. 2, 3, 5

work page 2022
[22]

Transformers are rnns: Fast autoregressive transformers with linear attention

Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, and Franc ¸ois Fleuret. Transformers are rnns: Fast autoregressive transformers with linear attention. InInternational conference on machine learning, pages 5156–5165. PMLR, 2020. 4

work page 2020
[23]

Auto-encoding variational bayes, 2013

Diederik P Kingma, Max Welling, et al. Auto-encoding variational bayes, 2013. 6

work page 2013
[24]

Tensor decompositions and applications.SIAM review, 51(3):455–500, 2009

Tamara G Kolda and Brett W Bader. Tensor decompositions and applications.SIAM review, 51(3):455–500, 2009. 4

work page 2009
[25]

Extracting spectral contrast in landsat thematic mapper image data using selective principal component analysis.Photogramm

P Kwarteng and A Chavez. Extracting spectral contrast in landsat thematic mapper image data using selective principal component analysis.Photogramm. Eng. Remote Sens, 55(1): 339–348, 1989. 1 9

work page 1989
[26]

Diffusion models for image restoration and enhancement: a comprehensive sur- vey.International Journal of Computer Vision, pages 1–31,

Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wen- jun Zeng, Xinchao Wang, and Zhibo Chen. Diffusion models for image restoration and enhancement: a comprehensive sur- vey.International Journal of Computer Vision, pages 1–31,

work page
[27]

Pmac- net: Parallel multiscale attention constraint network for pan- sharpening.IEEE Geoscience and Remote Sensing Letters, 19:1–5, 2022

Yixun Liang, Ping Zhang, Yang Mei, and Tingqi Wang. Pmac- net: Parallel multiscale attention constraint network for pan- sharpening.IEEE Geoscience and Remote Sensing Letters, 19:1–5, 2022. 2

work page 2022
[28]

Flow Matching for Generative Modeling

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022. 3, 5

work page internal anchor Pith review Pith/arXiv arXiv 2022
[29]

Residual denoising diffusion models

Jiawei Liu, Qiang Wang, Huijie Fan, Yinong Wang, Yan- dong Tang, and Liangqiong Qu. Residual denoising diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2773–2783,

work page
[30]

Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow straight and fast: Learning to generate and transfer data with rectified flow.arXiv preprint arXiv:2209.03003, 2022. 3

work page internal anchor Pith review Pith/arXiv arXiv 2022
[31]

Decoupled Weight Decay Regularization

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 6, 3

work page internal anchor Pith review Pith/arXiv arXiv 2017
[32]

Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps.Advances in Neural Information Processing Systems, 35:5775–5787, 2022

Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. Dpm-solver: A fast ode solver for diffusion probabilistic model sampling in around 10 steps.Advances in Neural Information Processing Systems, 35:5775–5787, 2022. 5

work page 2022
[33]

Pansharpening by convolutional neural networks.Remote Sensing, 8(7):594, 2016

Giuseppe Masi, Davide Cozzolino, Luisa Verdoliva, and Giuseppe Scarpa. Pansharpening by convolutional neural networks.Remote Sensing, 8(7):594, 2016. 2, 7, 8, 3, 4

work page 2016
[34]

Pan- diff: A novel pansharpening method based on denoising diffu- sion probabilistic model.IEEE Transactions on Geoscience and Remote Sensing, 61:1–17, 2023

Qingyan Meng, Wenxu Shi, Sijia Li, and Linlin Zhang. Pan- diff: A novel pansharpening method based on denoising diffu- sion probabilistic model.IEEE Transactions on Geoscience and Remote Sensing, 61:1–17, 2023. 2, 3, 7, 8, 4

work page 2023
[35]

Pansharpening with a guided filter based on three-layer decomposition.Sensors, 16(7):1068, 2016

Xiangchao Meng, Jie Li, Huanfeng Shen, Liangpei Zhang, and Hongyan Zhang. Pansharpening with a guided filter based on three-layer decomposition.Sensors, 16(7):1068, 2016. 1

work page 2016
[36]

Improved denoising diffusion probabilistic models

Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. InInternational conference on machine learning, pages 8162–8171. PMLR,

work page
[37]

Introduction of sensor spectral response into image fusion methods

Xavier Otazu, Mar´ıa Gonz´alez-Aud´ıcana, Octavi Fors, and Jorge N´u˜nez. Introduction of sensor spectral response into image fusion methods. application to wavelet-based methods. IEEE Transactions on Geoscience and Remote Sensing, 43 (10):2376–2385, 2005. 1

work page 2005
[38]

Source-adaptive discriminative kernels based network for remote sensing pansharpening

Siran Peng, Liang-Jian Deng, Jin-Fan Hu, and Yu-Wei Zhuo. Source-adaptive discriminative kernels based network for remote sensing pansharpening. InIJCAI, pages 1283–1289,

work page
[39]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022. 2, 3

work page 2022
[40]

U- net: Convolutional networks for biomedical image segmen- tation

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U- net: Convolutional networks for biomedical image segmen- tation. InMedical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015. 5, 1, 3

work page 2015
[41]

Unsupervised hyperspectral pansharp- ening via low-rank diffusion model.Information Fusion, 107: 102325, 2024

Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Yue, and Deyu Meng. Unsupervised hyperspectral pansharp- ening via low-rank diffusion model.Information Fusion, 107: 102325, 2024. 3, 7, 8, 2, 4

work page 2024
[42]

Image super- resolution via iterative refinement.IEEE transactions on pattern analysis and machine intelligence, 45(4):4713–4726,

Chitwan Saharia, Jonathan Ho, William Chan, Tim Sali- mans, David J Fleet, and Mohammad Norouzi. Image super- resolution via iterative refinement.IEEE transactions on pattern analysis and machine intelligence, 45(4):4713–4726,

work page
[43]

Bespoke solvers for generative flow models.arXiv preprint arXiv:2310.19075, 2023

Neta Shaul, Juan Perez, Ricky TQ Chen, Ali Thabet, Albert Pumarola, and Yaron Lipman. Bespoke solvers for generative flow models.arXiv preprint arXiv:2310.19075, 2023. 3

work page arXiv 2023
[44]

Efficient attention: Attention with linear com- plexities

Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, and Hongsheng Li. Efficient attention: Attention with linear com- plexities. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 3531–3539, 2021. 4

work page 2021
[45]

Denoising Diffusion Implicit Models

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models.arXiv preprint arXiv:2010.02502,

work page internal anchor Pith review Pith/arXiv arXiv 2010
[46]

Score-Based Generative Modeling through Stochastic Differential Equations

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Ab- hishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020. 3

work page internal anchor Pith review Pith/arXiv arXiv 2011
[47]

Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. 2023.
[48]

Jiangtong Tan, Jie Huang, Naishan Zheng, Man Zhou, Keyu Yan, Danfeng Hong, and Feng Zhao. Revisiting spatial-frequency information integration from a hierarchical perspective for panchromatic and multi-spectral image fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25922–25931, 2024.
[49]

Ledyard R Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3):279–311, 1966.
[50]

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.
[51]

Gemine Vivone. Robust band-dependent spatial-detail approaches for panchromatic sharpening. IEEE transactions on Geoscience and Remote Sensing, 57(9):6421–6433, 2019.
[52]

Gemine Vivone, Rocco Restaino, and Jocelyn Chanussot. A regression-based high-pass modulation pansharpening approach. IEEE Transactions on geoscience and remote sensing, 56(2):984–996, 2017.
[53]

Gemine Vivone, Rocco Restaino, and Jocelyn Chanussot. Full scale regression-based injection coefficients for panchromatic sharpening. IEEE Transactions on Image Processing, 27(7):3418–3431, 2018.
[54]

Lucien Wald. Data fusion: definitions and architectures: fusion of images of different spatial resolutions. Presses des MINES, 2002.
[55]

Lucien Wald, Thierry Ranchin, and Marc Mangolini. Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images. Photogrammetric engineering and remote sensing, 63(6):691–699, 1997.
[56]

Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin CK Chan, and Chen Change Loy. Exploiting diffusion prior for real-world image super-resolution. International Journal of Computer Vision, 132(12):5929–5949, 2024.
[57]

Kai Wang, Dongwen Tang, Boya Zeng, Yida Yin, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, and Yang You. Neural network diffusion. arXiv preprint arXiv:2402.13144, 2024.
[58]

Yancong Wei, Qiangqiang Yuan, Xiangchao Meng, Huanfeng Shen, Liangpei Zhang, and Michael Ng. Multi-scale-and-depth convolutional neural network for remote sensed imagery pan-sharpening. In 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), pages 3413–3416. IEEE, 2017.
[59]

Chen Wu, Bo Du, Xiaohui Cui, and Liangpei Zhang. A post-classification change detection method based on iterative slow feature analysis and bayesian soft fusion. Remote Sensing of Environment, 199:241–255, 2017.
[60]

Xiao Wu, Ting-Zhu Huang, Liang-Jian Deng, and Tian-Jing Zhang. Dynamic cross feature fusion for remote sensing pansharpening. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14687–14696, 2021.
[61]

Zhong-Cheng Wu, Ting-Zhu Huang, Liang-Jian Deng, Jie Huang, Jocelyn Chanussot, and Gemine Vivone. Lrtcfpan: Low-rank tensor completion based framework for pansharpening. IEEE Transactions on Image Processing, 32:1640–1655,
[62]

Zhong-Cheng Wu, Ting-Zhu Huang, Liang-Jian Deng, and Gemine Vivone. A framelet sparse reconstruction method for pansharpening with guaranteed convergence. Inverse Problems and Imaging, 17(6):1277–1300, 2023.
[63]

Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. Diffir: Efficient diffusion model for image restoration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 13095–13105, 2023.
[64]

Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng, Guang Lin, Zihan Cao, Chao Li, and Qibin Zhao. Hyperspectral pansharpening via diffusion models with iteratively zero-shot guidance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12669–12678, 2025.
[65]

Junfeng Yang, Xueyang Fu, Yuwen Hu, Yue Huang, Xinghao Ding, and John Paisley. Pannet: A deep network architecture for pan-sharpening. In Proceedings of the IEEE international conference on computer vision, pages 5449–5457, 2017.
[66]

Wenming Yang, Xuechen Zhang, Yapeng Tian, Wei Wang, Jing-Hao Xue, and Qingmin Liao. Deep learning for single image super-resolution: A brief review. IEEE Transactions on Multimedia, 21(12):3106–3121, 2019.
[67]

Xiaohui Yuan, Jianfang Shi, and Lichuan Gu. A review of deep learning methods for semantic segmentation of remote sensing imagery. Expert Systems with Applications, 169:114417, 2021.
[68]

Zongsheng Yue, Jianyi Wang, and Chen Change Loy. Resshift: Efficient diffusion model for image super-resolution by residual shifting. Advances in Neural Information Processing Systems, 36:13294–13307, 2023.
[69]

Kaiwen Zheng, Cheng Lu, Jianfei Chen, and Jun Zhu. Dpm-solver-v3: Improved diffusion ode solver with empirical model statistics. Advances in Neural Information Processing Systems, 36:55502–55542, 2023.
[70]

Jie Zhou, Daniel L Civco, and John A Silander. A wavelet transform method to merge landsat tm and spot panchromatic data. International journal of remote sensing, 19(4):743–757,
[71]

Linqi Zhou, Aaron Lou, Samar Khanna, and Stefano Ermon. Denoising diffusion bridge models. arXiv preprint arXiv:2309.16948, 2023.
[72]

Man Zhou, Jie Huang, Yanchi Fang, Xueyang Fu, and Aiping Liu. Pan-sharpening with customized transformer and invertible neural network. In Proceedings of the AAAI conference on artificial intelligence, pages 3553–3561, 2022.
[73]

Man Zhou, Keyu Yan, Jinshan Pan, Wenqi Ren, Qi Xie, and Xiangyong Cao. Memory-augmented deep unfolding network for guided image super-resolution. International Journal of Computer Vision, 131(1):215–242, 2023.

Fast Kernel-Space Diffusion for Remote Sensing Pansharpening
Supplementary Material

Abstract

In this supplementary material, we fir...
Methods Explanation

6.1. Pansharpening Network

The baseline network we build is shown in Fig. 7. The panchromatic (PAN) image $P \in \mathbb{R}^{H \times W \times 1}$ is first duplicated along the channel dimension to match the number of channels in the low-resolution multispectral (LRMS) image $M \in \mathbb{R}^{H \times W \times C}$. The LRMS image is then subtracted from the duplicated PAN image, and the...
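The channel duplication and subtraction described for the baseline network in Section 6.1 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `pan_lrms_difference`, the toy array sizes, and the sign convention (duplicated PAN minus LRMS) are assumptions, and the LRMS input is assumed to already share the spatial size H×W of the PAN image, as the stated shapes suggest.

```python
import numpy as np


def pan_lrms_difference(pan: np.ndarray, lrms: np.ndarray) -> np.ndarray:
    """Duplicate PAN along channels to match LRMS, then subtract LRMS.

    pan:  array of shape (H, W, 1)
    lrms: array of shape (H, W, C), assumed to be at PAN resolution
    """
    if pan.ndim != 3 or pan.shape[-1] != 1:
        raise ValueError("pan must have shape (H, W, 1)")
    if lrms.ndim != 3 or lrms.shape[:2] != pan.shape[:2]:
        raise ValueError("lrms must have shape (H, W, C) with the same H and W as pan")
    # Repeat the single PAN channel C times so both arrays are (H, W, C).
    pan_dup = np.repeat(pan, lrms.shape[-1], axis=-1)
    # Assumed sign convention: duplicated PAN minus LRMS.
    return pan_dup - lrms


# Toy example with H=2, W=3, C=4 (sizes chosen only for illustration).
pan = np.arange(6, dtype=np.float32).reshape(2, 3, 1)
lrms = np.ones((2, 3, 4), dtype=np.float32)
diff = pan_lrms_difference(pan, lrms)
print(diff.shape)
```

Broadcasting (`pan - lrms`) would give the same result without the explicit copy; `np.repeat` is used here only to mirror the duplication step named in the text.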
Details on Experiments

7.1. Datasets

We conducted experiments using datasets derived from WorldView-3 (WV3), QuickBird (QB), and GaoFen-2 (GF2) satellite imagery. These datasets consist of image patches cropped from full remote sensing scenes and are partitioned into training and testing subsets. The WV3 dataset contains four images from two geographic lo...
Additional Results

8.1. Main Results

Tab. 8 and Tab. 9 present the quantitative performance benchmarks on the full-resolution GF2 and QB datasets. The results indicate that the proposed KSDiff method exhibits strong generalization capabilities across different data domains. Fig. 9 to Fig. 14 provide qualitative comparisons of visual outputs generated ...

[40] [40]

U- net: Convolutional networks for biomedical image segmen- tation

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U- net: Convolutional networks for biomedical image segmen- tation. InMedical image computing and computer-assisted intervention–MICCAI 2015: 18th international conference, Munich, Germany, October 5-9, 2015, proceedings, part III 18, pages 234–241. Springer, 2015. 5, 1, 3

work page 2015

[41] [41]

Unsupervised hyperspectral pansharp- ening via low-rank diffusion model.Information Fusion, 107: 102325, 2024

Xiangyu Rui, Xiangyong Cao, Li Pang, Zeyu Zhu, Zongsheng Yue, and Deyu Meng. Unsupervised hyperspectral pansharp- ening via low-rank diffusion model.Information Fusion, 107: 102325, 2024. 3, 7, 8, 2, 4

work page 2024

[42] [42]

Image super- resolution via iterative refinement.IEEE transactions on pattern analysis and machine intelligence, 45(4):4713–4726,

Chitwan Saharia, Jonathan Ho, William Chan, Tim Sali- mans, David J Fleet, and Mohammad Norouzi. Image super- resolution via iterative refinement.IEEE transactions on pattern analysis and machine intelligence, 45(4):4713–4726,

work page

[43] [43]

Bespoke solvers for generative flow models.arXiv preprint arXiv:2310.19075, 2023

Neta Shaul, Juan Perez, Ricky TQ Chen, Ali Thabet, Albert Pumarola, and Yaron Lipman. Bespoke solvers for generative flow models.arXiv preprint arXiv:2310.19075, 2023. 3

work page arXiv 2023

[44] [44]

Efficient attention: Attention with linear com- plexities

Zhuoran Shen, Mingyuan Zhang, Haiyu Zhao, Shuai Yi, and Hongsheng Li. Efficient attention: Attention with linear com- plexities. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 3531–3539, 2021. 4

work page 2021

[45] [45]

Denoising Diffusion Implicit Models

Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models.arXiv preprint arXiv:2010.02502,

work page internal anchor Pith review Pith/arXiv arXiv 2010

[46] [46]

Score-Based Generative Modeling through Stochastic Differential Equations

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Ab- hishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020. 3

work page internal anchor Pith review Pith/arXiv arXiv 2011

[47] [47]

Consistency models

Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. 2023. 3

work page 2023

[48] [48]

Revisiting spatial- frequency information integration from a hierarchical per- spective for panchromatic and multi-spectral image fusion

Jiangtong Tan, Jie Huang, Naishan Zheng, Man Zhou, Keyu Yan, Danfeng Hong, and Feng Zhao. Revisiting spatial- frequency information integration from a hierarchical per- spective for panchromatic and multi-spectral image fusion. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25922–25931, 2024. 3

work page 2024

[49] [49]

Some mathematical notes on three-mode factor analysis.Psychometrika, 31(3):279–311, 1966

Ledyard R Tucker. Some mathematical notes on three-mode factor analysis.Psychometrika, 31(3):279–311, 1966. 4

work page 1966

[50] [50]

Attention is all you need.Advances in neural information processing systems, 30, 2017

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszko- reit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.Advances in neural information processing systems, 30, 2017. 4

work page 2017

[51] [51]

Robust band-dependent spatial-detail ap- proaches for panchromatic sharpening.IEEE transactions on Geoscience and Remote Sensing, 57(9):6421–6433, 2019

Gemine Vivone. Robust band-dependent spatial-detail ap- proaches for panchromatic sharpening.IEEE transactions on Geoscience and Remote Sensing, 57(9):6421–6433, 2019. 7, 8, 2, 3, 4

work page 2019

[52] [52]

A regression-based high-pass modulation pansharpening ap- proach.IEEE Transactions on geoscience and remote sensing, 56(2):984–996, 2017

Gemine Vivone, Rocco Restaino, and Jocelyn Chanussot. A regression-based high-pass modulation pansharpening ap- proach.IEEE Transactions on geoscience and remote sensing, 56(2):984–996, 2017. 1

work page 2017

[53] [53]

Full scale regression-based injection coefficients for panchromatic sharpening.IEEE Transactions on Image Processing, 27(7): 3418–3431, 2018

Gemine Vivone, Rocco Restaino, and Jocelyn Chanussot. Full scale regression-based injection coefficients for panchromatic sharpening.IEEE Transactions on Image Processing, 27(7): 3418–3431, 2018. 7, 8, 2, 3, 4

work page 2018

[54] [54]

Presses des MINES, 2002

Lucien Wald.Data fusion: definitions and architectures: fusion of images of different spatial resolutions. Presses des MINES, 2002. 6 10

work page 2002

[55] [55]

Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images.Photogrammetric engineering and remote sensing, 63(6):691–699, 1997

Lucien Wald, Thierry Ranchin, and Marc Mangolini. Fusion of satellite images of different spatial resolutions: Assessing the quality of resulting images.Photogrammetric engineering and remote sensing, 63(6):691–699, 1997. 6

work page 1997

[56] [56]

Exploiting diffusion prior for real-world image super-resolution.International Journal of Computer Vision, 132(12):5929–5949, 2024

Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin CK Chan, and Chen Change Loy. Exploiting diffusion prior for real-world image super-resolution.International Journal of Computer Vision, 132(12):5929–5949, 2024. 3

work page 2024

[57] [57]

Neural network diffusion

Kai Wang, Dongwen Tang, Boya Zeng, Yida Yin, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, and Yang You. Neural network diffusion.arXiv preprint arXiv:2402.13144, 2024. 3

work page arXiv 2024

[58] [58]

Multi-scale- and-depth convolutional neural network for remote sensed imagery pan-sharpening

Yancong Wei, Qiangqiang Yuan, Xiangchao Meng, Huan- feng Shen, Liangpei Zhang, and Michael Ng. Multi-scale- and-depth convolutional neural network for remote sensed imagery pan-sharpening. In2017 IEEE International Geo- science and Remote Sensing Symposium (IGARSS), pages 3413–3416. IEEE, 2017. 7, 8, 2, 3, 4

work page 2017

[59] [59]

A post- classification change detection method based on iterative slow feature analysis and bayesian soft fusion.Remote Sensing of Environment, 199:241–255, 2017

Chen Wu, Bo Du, Xiaohui Cui, and Liangpei Zhang. A post- classification change detection method based on iterative slow feature analysis and bayesian soft fusion.Remote Sensing of Environment, 199:241–255, 2017. 1

work page 2017

[60] [60]

Dynamic cross feature fusion for remote sensing pan- sharpening

Xiao Wu, Ting-Zhu Huang, Liang-Jian Deng, and Tian-Jing Zhang. Dynamic cross feature fusion for remote sensing pan- sharpening. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 14687–14696, 2021. 2, 7, 8, 3, 4

work page 2021

[61] [61]

Lrtcfpan: Low-rank tensor completion based framework for pansharpen- ing.IEEE Transactions on Image Processing, 32:1640–1655,

Zhong-Cheng Wu, Ting-Zhu Huang, Liang-Jian Deng, Jie Huang, Jocelyn Chanussot, and Gemine Vivone. Lrtcfpan: Low-rank tensor completion based framework for pansharpen- ing.IEEE Transactions on Image Processing, 32:1640–1655,

work page

[62] [62]

A framelet sparse reconstruction method for pansharpening with guaranteed convergence.Inverse Problems and Imaging, 17(6):1277–1300, 2023

Zhong-Cheng Wu, Ting-Zhu Huang, Liang-Jian Deng, and Gemine Vivone. A framelet sparse reconstruction method for pansharpening with guaranteed convergence.Inverse Problems and Imaging, 17(6):1277–1300, 2023. 1

work page 2023

[63] Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. Diffir: Efficient diffusion model for image restoration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 13095–13105, 2023.

[64] Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng, Guang Lin, Zihan Cao, Chao Li, and Qibin Zhao. Hyperspectral pansharpening via diffusion models with iteratively zero-shot guidance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12669–12678, 2025.

[65] Junfeng Yang, Xueyang Fu, Yuwen Hu, Yue Huang, Xinghao Ding, and John Paisley. Pannet: A deep network architecture for pan-sharpening. In Proceedings of the IEEE international conference on computer vision, pages 5449–5457, 2017.

[66] Wenming Yang, Xuechen Zhang, Yapeng Tian, Wei Wang, Jing-Hao Xue, and Qingmin Liao. Deep learning for single image super-resolution: A brief review. IEEE Transactions on Multimedia, 21(12):3106–3121, 2019.

[67] Xiaohui Yuan, Jianfang Shi, and Lichuan Gu. A review of deep learning methods for semantic segmentation of remote sensing imagery. Expert Systems with Applications, 169:114417, 2021.

[68] Zongsheng Yue, Jianyi Wang, and Chen Change Loy. Resshift: Efficient diffusion model for image super-resolution by residual shifting. Advances in Neural Information Processing Systems, 36:13294–13307, 2023.

[69] Kaiwen Zheng, Cheng Lu, Jianfei Chen, and Jun Zhu. Dpm-solver-v3: Improved diffusion ode solver with empirical model statistics. Advances in Neural Information Processing Systems, 36:55502–55542, 2023.

[70] Jie Zhou, Daniel L Civco, and John A Silander. A wavelet transform method to merge landsat tm and spot panchromatic data. International journal of remote sensing, 19(4):743–757,

[71] Linqi Zhou, Aaron Lou, Samar Khanna, and Stefano Ermon. Denoising diffusion bridge models. arXiv preprint arXiv:2309.16948, 2023.

[72] Man Zhou, Jie Huang, Yanchi Fang, Xueyang Fu, and Aiping Liu. Pan-sharpening with customized transformer and invertible neural network. In Proceedings of the AAAI conference on artificial intelligence, pages 3553–3561, 2022.

[73] Man Zhou, Keyu Yan, Jinshan Pan, Wenqi Ren, Qi Xie, and Xiangyong Cao. Memory-augmented deep unfolding network for guided image super-resolution. International Journal of Computer Vision, 131(1):215–242, 2023.

Fast Kernel-Space Diffusion for Remote Sensing Pansharpening
Supplementary Material

Abstract

In this supplementary material, we fir...

Methods Explanation

6.1. Pansharpening Network

The baseline network we build is shown in Fig. 7. The panchromatic (PAN) image $P \in \mathbb{R}^{H \times W \times 1}$ is first duplicated along the channel dimension to match the number of channels in the low-resolution multispectral (LRMS) image $M \in \mathbb{R}^{H \times W \times C}$. The LRMS image is then subtracted from the duplicated PAN image, and the...
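The channel duplication and subtraction described in Section 6.1 can be illustrated with a minimal, dependency-free sketch. The function name `pan_minus_lrms`, the nested-list image layout, and the sign convention (duplicated PAN minus LRMS) are assumptions made for illustration; the source text is truncated after this step, so nothing beyond the subtraction is shown, and this is not the authors' implementation.

```python
from typing import List

# Assumed layout for illustration: image[row][col][channel]
Image = List[List[List[float]]]


def pan_minus_lrms(pan: Image, lrms: Image) -> Image:
    """Duplicate the single PAN channel to C channels, then subtract the LRMS image."""
    residual: Image = []
    for pan_row, ms_row in zip(pan, lrms):
        out_row = []
        for pan_px, ms_px in zip(pan_row, ms_row):
            p = pan_px[0]  # PAN has exactly one channel per pixel
            # Reusing p for every LRMS channel is equivalent to duplicating
            # PAN along the channel dimension before subtracting.
            out_row.append([p - m for m in ms_px])
        residual.append(out_row)
    return residual


# Toy 2x2 example: PAN has 1 channel, LRMS has C = 2 channels.
pan = [[[0.5], [1.0]], [[0.25], [0.75]]]
lrms = [[[0.5, 0.25], [0.5, 1.0]], [[0.0, 0.25], [0.25, 0.5]]]
residual = pan_minus_lrms(pan, lrms)
```

In a tensor library the same step would normally be a single broadcasted subtraction; the explicit loops here only make the channel duplication visible.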

Details on Experiments

7.1. Datasets

We conducted experiments using datasets derived from WorldView-3 (WV3), QuickBird (QB), and GaoFen-2 (GF2) satellite imagery. These datasets consist of image patches cropped from full remote sensing scenes and are partitioned into training and testing subsets. The WV3 dataset contains four images from two geographic lo...

Additional Results

8.1. Main Results

Tab. 8 and Tab. 9 present the quantitative performance benchmarks on the full-resolution GF2 and QB datasets. The results indicate that the proposed KSDiff method exhibits strong generalization capabilities across different data domains. Fig. 9 to Fig. 14 provide qualitative comparisons of visual outputs generated ...