Degradation-Robust Fusion: An Efficient Degradation-Aware Diffusion Framework for Multimodal Image Fusion in Arbitrary Degradation Scenarios
Pith reviewed 2026-05-10 17:24 UTC · model grok-4.3
The pith
Direct regression of the fused image in a diffusion process with joint constraint correction enables robust multimodal fusion under arbitrary degradations using few sampling steps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that regressing the fused image directly within a diffusion-style process (implicit denoising), combined with a joint observation model correction that imposes degradation and fusion constraints simultaneously during sampling, yields efficient and accurate multimodal image fusion under arbitrary degradations, without explicit noise modeling or a large number of sampling steps.
What carries the argument
The joint observation model correction mechanism, which simultaneously applies degradation and fusion constraints during the limited-step sampling process in a direct-regression diffusion framework.
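The mechanism described above can be illustrated with a toy sampler. This is a minimal sketch under stated assumptions, not the paper's formulation, which the abstract does not give: `regress_fused` is a hypothetical stand-in for a trained network, both sources share one known box-blur degradation, and the "joint" objective is a weighted least-squares term in which the blur carries the degradation constraint and the weights `w_a`, `w_b` play the role of a (very crude) fusion constraint.

```python
import numpy as np

def box_blur(x, k=3):
    """k x k box blur with circular padding. The kernel is symmetric, so the
    operator equals its own adjoint and has spectral norm at most 1."""
    r = k // 2
    out = np.zeros_like(x, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(x, (dy, dx), axis=(0, 1))
    return out / (k * k)

def joint_loss(x, y_a, y_b, w_a=0.5, w_b=0.5):
    """Assumed joint objective: blurred estimate should match both observations."""
    ax = box_blur(x)
    return 0.5 * (w_a * np.sum((ax - y_a) ** 2) + w_b * np.sum((ax - y_b) ** 2))

def joint_correction(x0, y_a, y_b, w_a=0.5, w_b=0.5, step=1.0, iters=5):
    """A few gradient steps on joint_loss; both terms are updated together."""
    x = x0.copy()
    for _ in range(iters):
        ax = box_blur(x)
        # Gradient is A^T (w_a (A x - y_a) + w_b (A x - y_b)) with A^T = A here.
        grad = box_blur(w_a * (ax - y_a) + w_b * (ax - y_b))
        x = x - step * grad
    return x

def regress_fused(x_t, y_a, y_b, alpha_bar):
    """Hypothetical placeholder for the network that predicts the fused image."""
    return np.sqrt(alpha_bar) * x_t + (1.0 - alpha_bar) * 0.5 * (y_a + y_b)

def sample(y_a, y_b, steps=4, seed=0):
    """Few-step deterministic (DDIM-style) sampling with x0 regression."""
    rng = np.random.default_rng(seed)
    alpha_bars = np.linspace(0.05, 0.999, steps)  # noisy -> clean
    x = rng.standard_normal(y_a.shape)
    for i, ab in enumerate(alpha_bars):
        x0 = regress_fused(x, y_a, y_b, ab)       # direct regression of the fused image
        x0 = joint_correction(x0, y_a, y_b)       # correction applied during sampling
        if i == len(alpha_bars) - 1:
            return x0
        eps = (x - np.sqrt(ab) * x0) / np.sqrt(1.0 - ab)  # implied noise estimate
        ab_next = alpha_bars[i + 1]
        x = np.sqrt(ab_next) * x0 + np.sqrt(1.0 - ab_next) * eps
```

With a step size of 1.0 and weights summing to 1, each correction step cannot increase `joint_loss`, because the blur operator has norm at most 1. Whether a comparable guarantee holds for the paper's actual correction is exactly what the referee report asks the authors to show.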
If this is right
- Fusion performance remains high even when input images suffer from combined degradations like noise and blur.
- The framework supports multiple multimodal tasks such as infrared-visible fusion and medical image fusion under the same model structure.
- High reconstruction accuracy is achieved with significantly fewer sampling steps compared to conventional diffusion approaches.
- Complementary information from multiple degraded sources is captured effectively through the regression-based process.
Where Pith is reading between the lines
- This regression strategy might extend to other multimodal inverse problems where ground-truth combined data is scarce.
- Removing the need for natural fused training data could lower barriers for applying generative models in fusion research.
- The interpretability gains from the structured sampling process may help diagnose failure cases in fusion outputs.
Load-bearing premise
Directly regressing the fused image effectively captures complementary information from multiple degraded inputs without explicit noise modeling, and the joint correction accurately enforces both types of constraints even with limited sampling steps.
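For context, in the standard DDPM setting the two prediction targets are linked in closed form. The notation below ($x_0$ for the clean image, $x_t$ for the noisy state, $\bar\alpha_t$ for the cumulative noise schedule, $\epsilon$ for the noise) is the usual DDPM notation and is assumed here, not taken from the paper; the abstract does not say whether the proposed regression target fits this relation exactly.

```latex
% Forward process
x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Clean-image estimate implied by a noise prediction \hat\epsilon
\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar\alpha_t}\, \hat\epsilon}{\sqrt{\bar\alpha_t}}

% Hence the two squared errors differ only by a timestep-dependent weight
\lVert \epsilon - \hat\epsilon \rVert^2
  = \frac{\bar\alpha_t}{1 - \bar\alpha_t}\, \lVert x_0 - \hat{x}_0 \rVert^2
```

The premise is therefore less about the change of target itself than about what plays the role of $x_0$ when no ground-truth fused image exists.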
What would settle it
An ablation study removing the joint observation model correction and measuring the resulting drop in fusion quality metrics on standard benchmarks with added combined degradations such as noise plus blur would test whether the mechanism is essential for the claimed accuracy.
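Such an ablation could be organized roughly as follows. This is a sketch under stated assumptions: the degradation is a box blur followed by additive Gaussian noise, PSNR against a synthetic reference stands in for whatever fusion metrics the benchmark uses, and `fuse_full` and `fuse_ablated` are hypothetical callables for the model with and without the joint correction.

```python
import numpy as np

def box_blur(x, k=3):
    """k x k box blur with circular padding."""
    r = k // 2
    out = np.zeros_like(x, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(x, (dy, dx), axis=(0, 1))
    return out / (k * k)

def degrade(img, noise_sigma=0.05, blur_k=3, seed=0):
    """Combined degradation: blur first, then additive Gaussian noise."""
    rng = np.random.default_rng(seed)
    return box_blur(img, blur_k) + noise_sigma * rng.standard_normal(img.shape)

def psnr(ref, est, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)

def ablation_gap(fuse_full, fuse_ablated, src_a, src_b, reference):
    """PSNR of the full model minus PSNR of the ablated model on the same inputs."""
    y_a = degrade(src_a, seed=0)
    y_b = degrade(src_b, seed=1)
    return psnr(reference, fuse_full(y_a, y_b)) - psnr(reference, fuse_ablated(y_a, y_b))
```

A consistently positive gap across degradation settings would support the claim that the correction is essential; a gap near zero would suggest the regression network is doing most of the work.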
Original abstract
Complex degradations like noise, blur, and low resolution are typical challenges in real-world image fusion tasks, limiting the performance and practicality of existing methods. End-to-end neural-network-based approaches are generally simple to design and highly efficient in inference, but their black-box nature leads to limited interpretability. Diffusion-based methods alleviate this to some extent by providing powerful generative priors and a more structured inference process. However, they are trained to learn a single-domain target distribution, whereas fusion lacks natural fused data and relies on modeling complementary information from multiple sources, making diffusion hard to apply directly in practice. To address these challenges, this paper proposes an efficient degradation-aware diffusion framework for image fusion under arbitrary degradation scenarios. Specifically, instead of explicitly predicting noise as in conventional diffusion models, our method performs implicit denoising by directly regressing the fused image, enabling flexible adaptation to diverse fusion tasks under complex degradations with limited steps. Moreover, we design a joint observation model correction mechanism that simultaneously imposes degradation and fusion constraints during sampling to ensure high reconstruction accuracy. Experiments on diverse fusion tasks and degradation configurations demonstrate the superiority of the proposed method under complex degradation scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an efficient degradation-aware diffusion framework for multimodal image fusion under arbitrary real-world degradations (noise, blur, low resolution). It replaces the standard noise-prediction objective with direct regression of the fused image for implicit denoising, enabling adaptation to diverse tasks with limited sampling steps, and introduces a joint observation model correction mechanism that enforces both degradation consistency and fusion constraints during sampling. Experiments on various fusion tasks and degradation configurations are claimed to demonstrate superiority over existing methods.
Significance. If the central claims hold, the work would offer a practical advance in applying diffusion models to fusion problems that lack natural ground-truth fused data. The direct-regression approach combined with a joint correction could provide efficiency and some interpretability gains over black-box end-to-end networks while leveraging generative priors, with potential impact on applications such as infrared-visible or medical image fusion where degradations are common. The emphasis on limited-step sampling addresses a key practicality barrier for diffusion in this domain.
major comments (2)
- [Abstract and Method] The substitution of noise prediction with direct regression of the fused image is load-bearing for the efficiency and 'implicit denoising' claims, yet the description provides no derivation showing how this still yields a valid reverse process that aggregates complementary information across arbitrarily degraded sources; the skeptic's concern that mismatches will compound rather than denoise away in limited steps therefore requires explicit analysis or comparison to standard diffusion objectives.
- [Method] The joint observation model correction mechanism is presented as simultaneously imposing degradation and fusion constraints to ensure high accuracy, but without its mathematical formulation, derivation, or proof that it avoids posterior mismatch in the limited-step regime, it is impossible to verify whether the mechanism is robust or merely ad-hoc; this is central to the reconstruction-accuracy claim.
minor comments (2)
- [Abstract] The phrase 'arbitrary degradation scenarios' is used without enumerating the specific degradation types or combinations tested; adding this would strengthen the claim of generality.
- [Experiments] While superiority is asserted, the abstract supplies no quantitative metrics, baselines, or ablation results; the full manuscript should ensure these are clearly tabulated with statistical significance tests.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper to strengthen the theoretical justification of our approach.
- Referee: [Abstract and Method] The substitution of noise prediction with direct regression of the fused image is load-bearing for the efficiency and 'implicit denoising' claims, yet the description provides no derivation showing how this still yields a valid reverse process that aggregates complementary information across arbitrarily degraded sources; the skeptic's concern that mismatches will compound rather than denoise away in limited steps therefore requires explicit analysis or comparison to standard diffusion objectives.
  Authors: We appreciate the referee's concern that the substitution of noise prediction with direct regression of the fused image requires explicit justification to confirm it produces a valid reverse process. In the revised manuscript, we will add a detailed derivation of the reverse diffusion process under this objective, showing how the joint constraints enable aggregation of complementary information from arbitrarily degraded sources. We will also include a direct comparison to standard DDPM noise-prediction objectives and analysis demonstrating that mismatches are mitigated (rather than compounded) during limited-step sampling.
  Revision: yes
- Referee: [Method] The joint observation model correction mechanism is presented as simultaneously imposing degradation and fusion constraints to ensure high accuracy, but without its mathematical formulation, derivation, or proof that it avoids posterior mismatch in the limited-step regime, it is impossible to verify whether the mechanism is robust or merely ad-hoc; this is central to the reconstruction-accuracy claim.
  Authors: We acknowledge that the current manuscript does not provide sufficient mathematical detail on the joint observation model correction mechanism. In the revision, we will include the complete formulation, a step-by-step derivation, and analysis (including discussion of posterior mismatch) to demonstrate that the mechanism robustly enforces both degradation consistency and fusion constraints in the limited-step regime, rather than relying on ad-hoc choices.
  Revision: yes
Circularity Check
No circularity: direct regression target and joint correction are explicit design choices, not reductions to inputs
The paper explicitly replaces the standard noise-prediction objective of diffusion models with direct regression of the fused image and introduces a joint observation model correction applied during sampling. These are presented as methodological adaptations to address the absence of natural fused targets and arbitrary degradations, rather than any derivation that equates a prediction to its own fitted parameters or self-referential definitions. No equations reduce claimed performance to quantities defined by the model itself, no uniqueness theorems are imported from self-citations, and no ansatzes are smuggled via prior work. Claims rest on experimental validation across tasks rather than tautological construction. This is the common case of a self-contained proposal adapting existing priors with new components.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Diffusion models can be effectively repurposed for fusion tasks by replacing explicit noise prediction with direct regression of the target fused image.
invented entities (1)
- joint observation model correction mechanism (no independent evidence)