Fast Model-guided Instance-wise Adaptation Framework for Real-world Pansharpening with Fidelity Constraints

Gemine Vivone; Jin-Liang Xiao; Liang-Jian Deng; Shan Yin; Zhiqi Yang

arxiv: 2604.08903 · v1 · submitted 2026-04-10 · 💻 cs.CV


Pith reviewed 2026-05-10 17:50 UTC · model grok-4.3

classification 💻 cs.CV

keywords: pansharpening · model-guided adaptation · fidelity constraints · instance-wise learning · remote sensing · image fusion · zero-shot generalization

The pith

A pretrained model guides a lightweight network to fuse satellite images quickly while meeting spectral and physical constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FMG-Pan to resolve a trade-off in pansharpening: supervised methods need large datasets, while zero-shot methods are slow and yield lower fusion quality. FMG-Pan has a pretrained model steer a small adaptive network on each new image pair through joint training under added spectral and physical fidelity terms. If this works, practitioners could obtain high-resolution multispectral outputs from a new sensor in seconds without retraining from scratch or collecting matched training data. The approach targets real-world conditions where test images come from different satellites or distributions than the original training set.

Core claim

FMG-Pan shows that a pretrained model can direct a lightweight adaptive network, via joint optimization with spectral and physical fidelity constraints, to deliver state-of-the-art pansharpening quality on real-world datasets. On WorldView-3, training plus inference for a 512x512x8 image finishes within three seconds on an RTX 3090 GPU, and the method outperforms prior zero-shot techniques in both quality and speed under intra- and cross-sensor tests.

What carries the argument

Model-guided instance-wise adaptation through joint optimization of a lightweight network with spectral and physical fidelity constraints, where the physical term is designed to preserve spatial details.
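
One way to read this mechanism, offered as a hedged sketch and not as the paper's actual formulation: let $P$ be the PAN image, $M$ the LRMS image, $g$ the pretrained model, and $f_{\theta}$ the lightweight adaptive network. Every symbol here, the loss names, and the weights $\lambda_{s}$ and $\lambda_{p}$ are assumptions introduced for illustration; the source gives neither the functional form of the fidelity terms nor the exact way the guidance enters.

```latex
\hat{X}_{\theta} = f_{\theta}\bigl(g(P, M),\, P,\, M\bigr), \qquad
\theta^{\star} = \arg\min_{\theta}\;
  \lambda_{s}\,\mathcal{L}_{\mathrm{spec}}\bigl(\hat{X}_{\theta},\, M\bigr)
  + \lambda_{p}\,\mathcal{L}_{\mathrm{phys}}\bigl(\hat{X}_{\theta},\, P,\, M\bigr)
```

Under this reading the minimization is solved once per image pair, which is what makes the adaptation instance-wise. The guidance could equally enter as an additional loss term tying $\hat{X}_{\theta}$ to $g(P, M)$; the summary does not say which.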

If this is right

The framework achieves both intra-sensor and cross-sensor generalization on real datasets without retraining the entire model.
The added physical fidelity term improves spatial detail retention compared with purely spectral constraints.
The per-instance adaptation runs fast enough for practical on-demand processing of satellite imagery.
Quality remains competitive with fully supervised methods while using far less data and compute per new sensor.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar model-guided adaptation could extend to other remote-sensing fusion tasks such as hyperspectral sharpening or multimodal registration.
The speed gain might allow real-time processing pipelines on edge hardware for disaster monitoring or agricultural imaging.
If the fidelity constraints prove robust, they could serve as plug-in regularizers for other instance-adaptive image restoration networks.

Load-bearing premise

A single pretrained model can reliably steer the lightweight network to high-quality results on any new real-world image pair without the adaptation step drifting or losing fidelity.

What would settle it

Running the method on a fresh cross-sensor dataset and finding that the output images score lower on standard pansharpening metrics than current zero-shot baselines, or that the full training-plus-inference time exceeds three seconds for a comparable image size.
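
The test described above can be written down as a small check. This is a minimal sketch under stated assumptions: HQNR is taken in its common product form with unit exponents, the function names `hqnr`, `timed`, and `claim_is_refuted` are hypothetical, and the three-second budget is the figure quoted in the claim.

```python
import time


def hqnr(d_lambda, d_s):
    """HQNR in the common product form (1 - D_lambda) * (1 - D_s); higher is better."""
    return (1.0 - d_lambda) * (1.0 - d_s)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def claim_is_refuted(method_hqnr, baseline_hqnrs, elapsed_s, budget_s=3.0):
    """True if the method scores below the best baseline or exceeds the time budget."""
    return method_hqnr < max(baseline_hqnrs) or elapsed_s > budget_s
```

When the timed function launches GPU work, the device has to be synchronized before the clock is read, otherwise the measured time only covers kernel launch.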

Figures

Figures reproduced from arXiv: 2604.08903 by Gemine Vivone, Jin-Liang Xiao, Liang-Jian Deng, Shan Yin, Zhiqi Yang.

**Figure 2.** Overall framework of the proposed FMG-Pan framework. The pretrained model and the PF module are first executed…

**Figure 3.** Flowchart of the PF module. At the downsampled resolution, the coefficient matrix…

**Figure 4.** Comparison of two adaptive models. (1) Standard…

**Figure 5.** Visual fusion images and related HQNR maps on a full resolution WV3 example.

**Figure 6.** Visual fusion images and the related HQNR maps on a full resolution WV2 example.

**Figure 7.** Comparison of HQNR for FusionMamba and LAGNet with and without the FMG-Pan framework across two real-world datasets (WV3, WV2). Note: * denotes LAGNet, without * denotes FusionMamba.

Adjacent paper text (4.4 Ablation Study): We evaluated the contribution of each component in our framework on 20 full resolution WV3 samples using FusionMamba as the pretrained model. Four settings were considered: full FMG-Pan, FMG-Pan with Lighter…

**Figure 9.** Runtime composition of the proposed FMG-Pan…

**Figure 10.** Visual fusion images and the related HQNR maps…

**Figure 11.** Visual fusion images and the related HQNR maps…

**Figure 12.** Comparison of HQNR for FusionMamba with and without the FMG-Pan framework across four real-world datasets (WV3, WV2, GF2, QB).

Adjacent paper text: We then fine-tune the adaptive model for only 30 epochs on the WV3 dataset, and for 100 epochs on the cross-sensor dataset (i.e., WV2). This setting could further reduce running times by reducing the epochs for the adaptive model on the training part. As reported in…

**Figure 13.** Comparison of efficiency and reconstruction quality on WV3. (a) Runtime, (b) GPU memory usage, (c) HQNR. FMG…

Original abstract

Pansharpening aims to generate high-resolution multispectral (HRMS) images by fusing low-resolution multispectral (LRMS) and high-resolution panchromatic (PAN) images while preserving both spectral and spatial information. Although deep learning (DL)-based pansharpening methods achieve impressive performance, they require high training cost and large datasets, and often degrade when the test distribution differs from training, limiting generalization. Recent zero-shot methods, trained on a single PAN/LRMS pair, offer strong generalization but suffer from limited fusion quality, high computational overhead, and slow convergence. To address these issues, we propose FMG-Pan, a fast and generalizable model-guided instance-wise adaptation framework for real-world pansharpening, achieving both cross-sensor generality and rapid training-inference. The framework leverages a pretrained model to guide a lightweight adaptive network through joint optimization with spectral and physical fidelity constraints. We further design a novel physical fidelity term to enhance spatial detail preservation. Extensive experiments on real-world datasets under both intra- and cross-sensor settings demonstrate state-of-the-art performance. On the WorldView-3 dataset, FMG-Pan completes training and inference for a 512x512x8 image within 3 seconds on an RTX 3090 GPU, significantly faster than existing zero-shot methods, making it suitable for practical deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper gives a concrete speed win for real-world pansharpening by adapting a pretrained model per image pair with lightweight tuning and fidelity terms, but the transfer reliability under sensor shift rests on unshown details.

The colleague should know two things up front. First, FMG-Pan claims to deliver both higher fusion quality than prior zero-shot methods and training-plus-inference within three seconds per 512x512x8 WorldView-3 image on an RTX 3090. Second, it does this by freezing a pretrained model and jointly optimizing a small adapter network under spectral and physical fidelity constraints, instead of retraining everything or running long per-image optimization from scratch.

Referee Report

3 major / 2 minor

Summary. The paper proposes FMG-Pan, a model-guided instance-wise adaptation framework for real-world pansharpening. A pretrained model guides a lightweight adaptive network via joint optimization under spectral and physical fidelity constraints. The method is claimed to deliver state-of-the-art fusion quality on real-world datasets in both intra- and cross-sensor regimes while completing training plus inference for a 512×512×8 image within 3 seconds on an RTX 3090.

Significance. If the central claims are substantiated, the work would be significant for practical deployment: it combines the generalization advantages of zero-shot methods with the quality of supervised approaches and offers a substantial runtime reduction over existing zero-shot baselines. The explicit use of fidelity constraints to regularize per-instance adaptation is a constructive direction for handling real sensor variability.

major comments (3)

§4 (Experiments), cross-sensor tables: the reported SOTA margins on WorldView-3 and other sensors are presented without ablation isolating the contribution of the physical fidelity term versus the spectral term or versus the pretrained-model guidance alone; without these controls it is impossible to verify that the adapter is being guided rather than merely compensating for domain shift.
§3.2 (Physical fidelity term): the exact functional form of the novel physical fidelity constraint is not shown to penalize sensor-specific spectral or spatial mismatches; if the term only enforces generic no-reference statistics, the joint optimization can converge to a fast but low-quality solution that still satisfies the reported metrics, undermining the transfer claim.
§3.1 (Joint optimization): the balancing weights for the fidelity constraints and the adaptation hyperparameters are listed as free parameters; the manuscript provides no sensitivity analysis or cross-sensor validation that these weights remain stable when the test distribution deviates from the pretraining data, which is load-bearing for the 3-second adaptation guarantee.

minor comments (2)

Figure 2 and §3: the diagram of the lightweight adapter architecture would benefit from explicit layer counts and parameter totals to allow readers to reproduce the claimed speed advantage.
§4.1: the intra-sensor results would be clearer if the same no-reference metrics used for cross-sensor evaluation were also reported for the intra-sensor case.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments that help strengthen the validation of our proposed framework. We address each major comment point by point below, indicating planned revisions to the manuscript.

Referee: §4 (Experiments), cross-sensor tables: the reported SOTA margins on WorldView-3 and other sensors are presented without ablation isolating the contribution of the physical fidelity term versus the spectral term or versus the pretrained-model guidance alone; without these controls it is impossible to verify that the adapter is being guided rather than merely compensating for domain shift.

Authors: We agree that targeted ablations are needed to isolate the individual contributions, especially to confirm the guidance effect in cross-sensor regimes. While the manuscript presents component-wise analysis in Section 4, we will add explicit ablation tables in the revised version that separately disable the physical fidelity term, the spectral term, and the pretrained-model guidance on the cross-sensor datasets (WorldView-3 and others) to directly address this concern.
revision: yes

Referee: §3.2 (Physical fidelity term): the exact functional form of the novel physical fidelity constraint is not shown to penalize sensor-specific spectral or spatial mismatches; if the term only enforces generic no-reference statistics, the joint optimization can converge to a fast but low-quality solution that still satisfies the reported metrics, undermining the transfer claim.

Authors: The physical fidelity term is constructed using the sensor degradation operators (downsampling for spectral and blurring for spatial) derived from the input PAN/LRMS pair, making it inherently sensor-specific rather than generic no-reference statistics. To strengthen this, we will revise Section 3.2 with an expanded derivation showing the penalization of sensor-specific mismatches and add supporting experiments that compare solutions with and without the term under cross-sensor shifts.
revision: yes

Referee: §3.1 (Joint optimization): the balancing weights for the fidelity constraints and the adaptation hyperparameters are listed as free parameters; the manuscript provides no sensitivity analysis or cross-sensor validation that these weights remain stable when the test distribution deviates from the pretraining data, which is load-bearing for the 3-second adaptation guarantee.

Authors: The weights were fixed after validation on pretraining data to support the rapid adaptation claim. We acknowledge the need for explicit validation of stability. In the revision we will include a sensitivity analysis section that varies the balancing weights and adaptation hyperparameters, reporting performance across both intra- and cross-sensor settings to confirm robustness.
revision: yes
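
The second response describes fidelity terms built from degradation operators applied to the fused image. A minimal NumPy sketch of that general idea follows. It is not the paper's PF module: average pooling stands in for the sensor blur plus decimation, a fixed linear band combination stands in for the PAN response, and the names `box_blur_downsample`, `spectral_fidelity`, `spatial_fidelity`, and `band_weights` are all hypothetical.

```python
import numpy as np


def box_blur_downsample(img, ratio=4):
    """Average-pool an (H, W, C) image by `ratio`; a crude stand-in for blur + decimation."""
    h, w, c = img.shape
    img = img[: h - h % ratio, : w - w % ratio]  # crop so both sides divide evenly
    return img.reshape(h // ratio, ratio, w // ratio, ratio, c).mean(axis=(1, 3))


def spectral_fidelity(hrms, lrms, ratio=4):
    """Mean squared error between the degraded fused image and the observed LRMS."""
    return float(np.mean((box_blur_downsample(hrms, ratio) - lrms) ** 2))


def spatial_fidelity(hrms, pan, band_weights):
    """Mean squared error between a weighted band sum of the fused image and the PAN."""
    synthetic_pan = hrms @ band_weights  # (H, W, C) @ (C,) -> (H, W)
    return float(np.mean((synthetic_pan - pan) ** 2))
```

Both terms are zero when the fused image is exactly consistent with the two observations, and they use only the input pair, which is why terms of this kind can be evaluated at full resolution without a reference image.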

Circularity Check

0 steps flagged

No circularity: empirical framework with independent experimental validation

The paper introduces FMG-Pan as a practical adaptation method that combines a fixed pretrained model with a lightweight per-instance network under explicit spectral and physical fidelity losses. All central claims (cross-sensor performance, 3-second runtime on WorldView-3) rest on reported empirical results from intra- and cross-sensor datasets rather than any derivation that reduces a prediction to a fitted parameter or self-citation by construction. No equations are presented that define the output in terms of the input quantities being optimized; the fidelity terms are externally motivated regularizers, not tautological re-statements of the target metrics. The method is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard deep-learning assumptions plus several practical choices whose values are not derived from first principles.

free parameters (2)

fidelity constraint weights
Balancing coefficients between spectral and physical fidelity terms must be chosen or tuned for each setting.
adaptation hyperparameters
Learning rate, iteration count, and network architecture details for the lightweight adapter are selected to achieve the reported speed-quality trade-off.
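
A sensitivity check over these free parameters could be organized as a plain grid sweep. The sketch below is illustrative only: `sweep_weights`, `score_spread`, and `toy_evaluate` are hypothetical names, the weight grids are arbitrary, and `toy_evaluate` is a made-up stand-in for "adapt on one image pair, then compute a quality score".

```python
from itertools import product


def sweep_weights(evaluate, spectral_weights, physical_weights):
    """Score every (spectral, physical) weight pair; return results sorted best-first."""
    results = [
        {"w_spec": ws, "w_phys": wp, "score": evaluate(ws, wp)}
        for ws, wp in product(spectral_weights, physical_weights)
    ]
    return sorted(results, key=lambda r: r["score"], reverse=True)


def score_spread(results):
    """Gap between the best and worst score; a small gap suggests low sensitivity."""
    scores = [r["score"] for r in results]
    return max(scores) - min(scores)


def toy_evaluate(w_spec, w_phys):
    """Made-up score surface peaking at (1.0, 0.5); replace with a real adaptation run."""
    return 1.0 - 0.1 * abs(w_spec - 1.0) - 0.05 * abs(w_phys - 0.5)


results = sweep_weights(toy_evaluate, [0.5, 1.0, 2.0], [0.1, 0.5, 1.0])
```

Running the same sweep separately on intra-sensor and cross-sensor image pairs, then comparing where the best setting lands and how large the spread is, would show whether one fixed setting transfers.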

axioms (1)

domain assumption: A pretrained pansharpening model supplies useful guidance for rapid per-instance adaptation
The framework presupposes that the pretrained model encodes transferable knowledge that the lightweight network can leverage without catastrophic forgetting or mode collapse.

pith-pipeline@v0.9.0 · 5553 in / 1406 out tokens · 35562 ms · 2026-05-10T17:50:58.717669+00:00 · methodology


[1] [1]

B Aiazzi, Luciano Alparone, S Baronti, R Carlà, Andrea Garzelli, and L Santurri

work page

[2] [2]

In Image and signal processing for remote sensing XX, Vol

Full-scale assessment of pansharpening methods and data products. In Image and signal processing for remote sensing XX, Vol. 9244. SPIE, 924402

work page

[3] [3]

Aiazzi, L

B. Aiazzi, L. Alparone, S. Baronti, A. Garzelli, and M. Selva. 2006. MTF-tailored multiscale fusion of high-resolution MS and Pan imagery.Photogrammetric Engineering and Remote Sensing72, 5 (2006), 591–596

work page 2006

[4] [4]

P. J. Burt and E. H. Adelson. 1987. The Laplacian Pyramid as a Compact Image Code. InReadings in Computer Vision, Martin A. Fischler and Oscar Firschein (Eds.). Morgan Kaufmann, San Francisco (CA), 671–679. doi:10.1016/B978-0-08- 051581-6.50065-9

work page doi:10.1016/b978-0-08- 1987

[5] [5]

Cao, L.-J

Q. Cao, L.-J. Deng, W. Wang, J. Hou, and G. Vivone. 2024. Zero-shot semi- supervised learning for pansharpening.Information Fusion101 (2024), 102001

work page 2024

[6] [6]

Z.-H. Cao, S. Cao, L.-J. Deng, X. Wu, J. Hou, and G. Vivone. 2024. Diffusion model with disentangled modulations for sharpening multispectral and hyperspectral images.Information Fusion104 (2024), 102158

work page 2024

[7] [7]

Kristianto, G

Z.-H. Cao, Y.-J. Liang, L.-J. Deng, and G. Vivone. 2025. An Efficient Image Fusion Network Exploiting Unifying Language and Mask Guidance.IEEE Transactions on Pattern Analysis and Machine Intelligence(2025), 1–18. doi:10.1109/TPAMI. 2025.3591930

work page doi:10.1109/tpami 2025

[8] [8]

Carper, T

W. Carper, T. Lillesand, and R. Kiefer. 1990. The use of intensity-hue-saturation transformations for merging SPOT panchromatic and multispectral image data. Photogrammetric Engineering and Remote Sensing56, 4 (1990), 459–467

work page 1990

[9] [9]

Bailey, Walter F

M. Ciotola, S. Vitale, A. Mazza, G. Poggi, and G. Scarpa. 2022. Pansharpening by Convolutional Neural Networks in the Full Resolution Framework.IEEE Transactions on Geoscience and Remote Sensing60 (2022), 1–17. doi:10.1109/TGRS. 2022.3163887

work page doi:10.1109/tgrs 2022

[10] [10]

T. F. Coleman and Y. Li. 1996. A reflective Newton method for minimizing a quadratic function subject to bounds on some of the variables.SIAM J. Optim.6, 4 (1996), 1040–1058

work page 1996

[11] [11]

Liang-Jian Deng, Minyu Feng, and Xue-Cheng Tai. 2019. The fusion of panchro- matic and multispectral remote sensing images via tensor-based sparse modeling and hyper-Laplacian prior.Information Fusion52 (2019), 76–89

work page 2019

[12] [12]

L.-J. Deng, M. Feng, and X.-C. Tai. 2019. The fusion of panchromatic and mul- tispectral remote sensing images via tensor-based sparse modeling and hyper- Laplacian prior.Information Fusion52 (2019), 76–89. doi:10.1016/j.inffus.2018.11. 014

work page doi:10.1016/j.inffus.2018.11 2019

[13] [13]

L.-J. Deng, G. Vivone, W. Guo, M. Dalla Mura, and J. Chanussot. 2018. A Vari- ational Pansharpening Approach Based on Reproducible Kernel Hilbert Space and Heaviside Function.IEEE Transactions on Image Processing27, 9 (2018), 4330–4344. doi:10.1109/TIP.2018.2839531

work page doi:10.1109/tip.2018.2839531 2018

[14] [14]

L.-J. Deng, G. Vivone, M. E. Paoletti, G. Scarpa, J. He, Y. Zhang, J. Chanussot, and A. Plaza. 2022. Machine learning in pansharpening: A benchmark, from shallow to deep networks.IEEE Geoscience and Remote Sensing Magazine10, 3 (2022), 279–315

work page 2022

[15] [15]

Deng, L.-J

S.-Q. Deng, L.-J. Deng, X. Wu, R. Ran, D. Hong, and G. Vivone. 2023. PSRT: Pyramid shuffle-and-reshuffle transformer for multispectral and hyperspectral image fusion.IEEE Transactions on Geoscience and Remote Sensing61 (2023), 1–15

work page 2023

[16] [16]

Garzelli

A. Garzelli. 2014. Pansharpening of multispectral images based on nonlocal parameter optimization.IEEE Transactions on Geoscience and Remote Sensing53, 4 (2014), 2096–2107

work page 2014

[17] [17]

Garzelli, F

A. Garzelli, F. Nencini, and L. Capobianco. 2007. Optimal MMSE pan sharpening of very high resolution multispectral images.IEEE Transactions on Geoscience and Remote Sensing46, 1 (2007), 228–236

work page 2007

[18] [18]

L. He, Y. Rao, J. Li, J. Chanussot, A. Plaza, J. Zhu, and B. Li. 2019. Pansharpening via Detail Injection Based Convolutional Neural Networks.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing12, 4 (2019), 1188–1204. doi:10.1109/JSTARS.2019.2898574

work page doi:10.1109/jstars.2019.2898574 2019

[19] [19]

Huang, R

J. Huang, R. Huang, J. Xu, S. Peng, Y. Duan, and L.-J. Deng. 2025. Wavelet-Assisted Multi-Frequency Attention Network for Pansharpening. InProceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vol. 39. 3662–3670

[20]

Z.-R. Jin, T.-J. Zhang, T.-X. Jiang, G. Vivone, and L.-J. Deng. 2022. LAGConv: Local-context adaptive convolution kernels with global harmonic bias for pansharpening. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vol. 36. 1113–1121.

[21]

P. Kwarteng and A. Chavez. 1989. Extracting spectral contrast in Landsat Thematic Mapper image data using selective principal component analysis. Photogrammetric Engineering and Remote Sensing 55, 1 (1989), 339–348.

[22]

J. G. Liu. 2000. Smoothing filter-based intensity modulation: A spectral preserve image fusion technique for improving spatial details. International Journal of Remote Sensing 21, 18 (2000), 3461–3472.

[23]

S. Lolli, L. Alparone, A. Garzelli, and G. Vivone. 2017. Haze correction for contrast-based multispectral pansharpening. IEEE Geoscience and Remote Sensing Letters 14, 12 (2017), 2255–2259.

[24]

G. Masi, D. Cozzolino, L. Verdoliva, and G. Scarpa. 2016. Pansharpening by convolutional neural networks. Remote Sensing 8, 7 (2016), 594.

[25]

Q. Meng, W. Shi, S. Li, and L. Zhang. 2023. PanDiff: A novel pansharpening method based on denoising diffusion probabilistic model. IEEE Transactions on Geoscience and Remote Sensing 61 (2023), 1–17.

[26]

S. Peng, X. Zhu, H. Deng, L.-J. Deng, and Z. Lei. 2024. FusionMamba: Efficient remote sensing image fusion with state space model. IEEE Transactions on Geoscience and Remote Sensing 62 (2024), 1–16.

[27]

R. Restaino, G. Vivone, M. Dalla Mura, and J. Chanussot. 2016. Fusion of multispectral and panchromatic images based on morphological operators. IEEE Transactions on Image Processing 25, 6 (2016), 2882–2895.

[28]

X. Rui, X. Cao, Y. Li, and D. Meng. 2024. Variational Zero-Shot Multispectral Pansharpening. IEEE Transactions on Geoscience and Remote Sensing 62 (2024), 1–16. doi:10.1109/TGRS.2024.3492059

[29]

G. Vivone. 2019. Robust band-dependent spatial-detail approaches for panchromatic sharpening. IEEE Transactions on Geoscience and Remote Sensing 57, 9 (2019), 6421–6433.

[30]

G. Vivone, L. Alparone, J. Chanussot, M. Dalla Mura, A. Garzelli, G. A. Licciardi, R. Restaino, and L. Wald. 2014. A critical comparison among pansharpening algorithms. IEEE Transactions on Geoscience and Remote Sensing 53, 5 (2014), 2565–2586.

[31]

G. Vivone, L.-J. Deng, S. Deng, D. Hong, M. Jiang, C. Li, W. Li, H. Shen, X. Wu, J.-L. Xiao, J. Yao, M. Zhang, J. Chanussot, S. García, and A. Plaza. 2025. Deep Learning in Remote Sensing Image Fusion: Methods, protocols, data, and future perspectives. IEEE Geoscience and Remote Sensing Magazine 13, 1 (2025), 269–310. doi:10.1109/MGRS.2024.3495516

[32]

G. Vivone, M. Dalla Mura, A. Garzelli, R. Restaino, G. Scarpa, M. O. Ulfarsson, L. Alparone, and J. Chanussot. 2020. A new benchmark based on recent advances in multispectral pansharpening: Revisiting pansharpening with classical and emerging pansharpening methods. IEEE Geoscience and Remote Sensing Magazine 9, 1 (2020), 53–81.

[33]

G. Vivone, R. Restaino, and J. Chanussot. 2018. Full scale regression-based injection coefficients for panchromatic sharpening. IEEE Transactions on Image Processing 27, 7 (2018), 3418–3431.

[34]

H. Wang, H. Zhang, X. Tian, and J. Ma. 2024. Zero-Sharpen: A universal pansharpening method across satellites for reducing scale-variance gap via zero-shot variation. Information Fusion 101 (2024), 102003.

[35]

W. Wang, L.-J. Deng, R. Ran, and G. Vivone. 2024. A general paradigm with detail-preserving conditional invertible network for image fusion. International Journal of Computer Vision 132, 4 (2024), 1029–1054.

[36]

X. Wu, Z.-H. Cao, T.-Z. Huang, L.-J. Deng, J. Chanussot, and G. Vivone. 2025. Fully-Connected Transformer for Multi-Source Image Fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 47, 3 (2025), 2071–2088.

[37]

Z.-C. Wu, T.-Z. Huang, L.-J. Deng, J. Huang, J. Chanussot, and G. Vivone. 2023. LRTCFPan: Low-rank tensor completion based framework for pansharpening. IEEE Transactions on Image Processing 32 (2023), 1640–1655.

[38]

J.-L. Xiao, T.-Z. Huang, L.-J. Deng, G. Lin, Z. Cao, C. Li, and Q. Zhao. 2025. Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance. In Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR). 12669–12678.

[39]

J.-L. Xiao, T.-Z. Huang, L.-J. Deng, Z.-C. Wu, X. Wu, and G. Vivone. 2023. Variational pansharpening based on coefficient estimation with nonlocal regression. IEEE Transactions on Geoscience and Remote Sensing 61 (2023), 1–15.

[40]

J. Yang, X. Fu, Y. Hu, Y. Huang, X. Ding, and J. Paisley. 2017. PanNet: A deep network architecture for pan-sharpening. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). 5449–5457.

[41]

Y. Zhong, X. Wu, Z. Cao, H.-X. Dou, and L.-J. Deng. 2024. SSDiff: Spatial-spectral integrated diffusion model for remote sensing pansharpening. Advances in Neural Information Processing Systems 37 (2024), 77962–77986.

[42]

H. Zhou, Q. Liu, and Y. Wang. 2022. PanFormer: A transformer based model for pan-sharpening. In 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 1–6.

[43]

J. Zhou, D. L. Civco, and J. A. Silander. 1998. A wavelet transform method to merge Landsat TM and SPOT panchromatic data. International Journal of Remote Sensing 19, 4 (1998), 743–757. doi:10.1080/014311698215973

[44]

X. X. Zhu and R. Bamler. 2012. A sparse image fusion algorithm with application to pan-sharpening. IEEE Transactions on Geoscience and Remote Sensing 51, 5 (2012), 2827–2836.

Supplementary Material

S1 Time Analysis

Fig. 8 shows our efficiency advantage compared to previous zero-shot methods. Fig. 9 provides a breakdown of runtime composition w...