Balanced Diffusion-Guided Fusion for Multimodal Remote Sensing Classification
Pith reviewed 2026-05-18 12:37 UTC · model grok-4.3
The pith
A balanced diffusion-guided fusion framework uses modality-masked diffusion features to hierarchically guide a multi-branch network and improve multimodal remote sensing classification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that their balanced diffusion-guided fusion (BDGF) framework addresses modality imbalance in pre-trained multimodal DDPMs through an adaptive modality masking strategy. The resulting diffusion features hierarchically guide feature extraction in a multi-branch network with CNN, Mamba, and transformer components, using feature fusion, group channel attention, and cross-attention. A mutual learning strategy aligns the branches by matching probability entropy and feature similarity. Together, these components are claimed to deliver superior classification performance across four multimodal remote sensing datasets.
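To make the cross-attention part of the guidance concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that tokens from one branch act as queries while diffusion-feature tokens act as keys and values, and it leaves out the learned projection matrices that a real attention layer would have. The names `softmax` and `cross_attention` are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries: branch tokens, a list of d-dimensional vectors (assumed role).
    keys, values: diffusion-feature tokens (assumed role); keys must share
    the query dimension d.
    Returns one output vector per query, each a weighted mix of `values`.
    """
    d = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs
```

Each output is a convex combination of the diffusion-feature values, so in this sketch the guidance can only re-weight information the diffusion model already encodes.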
What carries the argument
The balanced diffusion-guided fusion (BDGF) framework, which applies adaptive modality masking to DDPMs for balanced features and uses those features to hierarchically guide a multi-branch network through fusion, attention, and mutual learning.
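The mutual learning step is described only as aligning probability entropy and feature similarity across branches. One plausible reading, sketched below under stated assumptions, penalizes the entropy gap and the cosine distance between every pair of branches. The exact loss form, the weights `w_entropy` and `w_feat`, and all function names are assumptions for illustration, not taken from the paper.

```python
import math
from itertools import combinations


def entropy(probs, eps=1e-12):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(p * math.log(p + eps) for p in probs)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mutual_alignment_loss(branch_probs, branch_feats, w_entropy=1.0, w_feat=1.0):
    """Average pairwise penalty over all branch pairs (assumed loss form).

    branch_probs: one class-probability vector per branch.
    branch_feats: one feature vector per branch, same order.
    """
    pairs = list(combinations(range(len(branch_probs)), 2))
    if not pairs:
        raise ValueError("need at least two branches")
    total = 0.0
    for i, j in pairs:
        entropy_gap = abs(entropy(branch_probs[i]) - entropy(branch_probs[j]))
        feature_gap = 1.0 - cosine(branch_feats[i], branch_feats[j])
        total += w_entropy * entropy_gap + w_feat * feature_gap
    return total / len(pairs)
```

With three branches (for example CNN, Mamba, and transformer) the function averages over three pairs; the loss is zero only when all branches are equally confident and their features point in the same direction.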
Load-bearing premise
The adaptive modality masking strategy successfully produces a modality-balanced data distribution in the DDPMs without discarding critical complementary information from any sensor.
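The page does not say how the masking adapts, so the following is only a sketch of one way such a strategy could work. It assumes each modality has a caller-supplied "dominance score" (for example its share of the denoising loss, which is itself an assumption) and masks more dominant modalities more often, while never masking every modality in the same sample. The modality names, `max_rate`, and both functions are hypothetical.

```python
import random


def masking_probabilities(dominance, max_rate=0.5):
    """Map dominance scores to per-modality masking probabilities.

    dominance: dict of modality name -> non-negative score (assumed input).
    Probabilities are proportional to the scores and sum to max_rate.
    """
    total = sum(dominance.values())
    if total <= 0:
        raise ValueError("dominance scores must sum to a positive value")
    return {name: max_rate * score / total for name, score in dominance.items()}


def mask_modalities(sample, probs, rng):
    """Zero out randomly chosen modalities in one training sample.

    sample: dict of modality name -> list of channel values.
    Returns (masked copy of the sample, set of masked modality names).
    """
    masked = {name for name in sample if rng.random() < probs[name]}
    if len(masked) == len(sample):
        # Keep the modality least likely to be masked so that at least
        # one sensor always stays visible to the model.
        masked.discard(min(probs, key=probs.get))
    out = {
        name: [0.0] * len(vals) if name in masked else list(vals)
        for name, vals in sample.items()
    }
    return out, masked
```

The guard against masking everything speaks directly to the premise above: if a sample could lose all sensors, or the weaker sensor were masked as often as the dominant one, complementary information would be discarded rather than rebalanced.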
What would settle it
Run the full classification experiments on the four multimodal datasets with and without the adaptive modality masking step. A significant drop in accuracy when masking is removed would indicate that the balancing is essential to the gains; no drop would suggest the multi-branch network accounts for them.
Original abstract
Deep learning-based techniques for the analysis of multimodal remote sensing data have become popular due to their ability to effectively integrate complementary spatial, spectral, and structural information from different sensors. Recently, denoising diffusion probabilistic models (DDPMs) have attracted attention in the remote sensing community due to their powerful ability to capture robust and complex spatial-spectral distributions. However, pre-training multimodal DDPMs may result in modality imbalance, and effectively leveraging diffusion features to guide complementary diversity feature extraction remains an open question. To address these issues, this paper proposes a balanced diffusion-guided fusion (BDGF) framework that leverages multimodal diffusion features to guide a multi-branch network for land-cover classification. Specifically, we propose an adaptive modality masking strategy to encourage the DDPMs to obtain a modality-balanced rather than spectral image-dominated data distribution. Subsequently, these diffusion features hierarchically guide feature extraction among CNN, Mamba, and transformer networks by integrating feature fusion, group channel attention, and cross-attention mechanisms. Finally, a mutual learning strategy is developed to enhance inter-branch collaboration by aligning the probability entropy and feature similarity of individual subnetworks. Extensive experiments on four multimodal remote sensing datasets demonstrate that the proposed method achieves superior classification performance. The code is available at https://github.com/HaoLiu-XDU/BDGF.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a Balanced Diffusion-Guided Fusion (BDGF) framework for multimodal remote sensing land-cover classification. It proposes an adaptive modality masking strategy so that pre-trained multimodal DDPMs learn a modality-balanced data distribution rather than a spectral-dominated one, then uses the resulting diffusion features to hierarchically guide feature extraction in a multi-branch network (CNN, Mamba, transformer) via feature fusion, group channel attention, and cross-attention. A mutual learning strategy aligns probability entropy and feature similarity across branches. The authors report superior classification performance on four public multimodal remote sensing datasets and release their code.
Significance. If the central claims hold, the work provides a timely engineering contribution by explicitly addressing modality imbalance in diffusion pre-training for remote sensing and combining it with recent architectures like Mamba for hierarchical guidance. The public code release is a clear strength supporting reproducibility. The significance hinges on whether the balanced diffusion features deliver gains beyond what the multi-branch architecture alone would achieve.
major comments (2)
- [Method (adaptive modality masking)] The adaptive modality masking strategy (described in the abstract and method) is load-bearing for the claim that balanced diffusion features drive the performance gains. No quantitative validation is provided (e.g., per-modality loss statistics, distribution histograms, or ablation comparing masked vs. unmasked pre-training) to confirm that masking equalizes modality influence without discarding critical complementary spectral or structural information from any sensor. If masking trades off unique features, the downstream results could be explained by the multi-branch network alone.
- [§4] §4 (experiments): The manuscript reports superior performance on four datasets but provides no error bars, statistical significance tests, or comprehensive ablation studies isolating the contribution of the masking strategy, hierarchical diffusion guidance, and mutual learning. This weakens verification of the central claim that the balanced diffusion features are responsible for the improvements.
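The per-modality loss statistics requested in the first major comment could be computed along the following lines. This is a sketch under assumptions: it treats the noise target and the model's noise prediction as flat per-pixel channel vectors, and it assumes each modality occupies a known channel range. `channel_slices`, the modality names, and both functions are hypothetical.

```python
def per_modality_mse(true_noise, pred_noise, channel_slices):
    """Mean squared denoising error, reported separately per modality.

    true_noise, pred_noise: flat lists of channel values for one pixel.
    channel_slices: dict of modality name -> (start, stop) channel range.
    """
    stats = {}
    for name, (start, stop) in channel_slices.items():
        errors = [
            (t - p) ** 2
            for t, p in zip(true_noise[start:stop], pred_noise[start:stop])
        ]
        stats[name] = sum(errors) / len(errors)
    return stats


def imbalance_ratio(stats):
    """Ratio of the worst to the best per-modality error (1.0 = balanced)."""
    lowest = min(stats.values())
    if lowest == 0.0:
        return float("inf")
    return max(stats.values()) / lowest
```

Tracking `imbalance_ratio` over pre-training, once with masking and once without, would give the kind of quantitative evidence the referee asks for: a ratio that moves toward 1.0 only under masking would support the balancing claim.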
minor comments (2)
- [Abstract] The abstract could briefly name the four datasets and report one or two key quantitative metrics (e.g., overall accuracy gains) to strengthen the summary of results.
- [Notation and method] Ensure consistent notation for the diffusion features and attention modules across sections; minor inconsistencies in variable naming could confuse readers.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. We have carefully reviewed the major comments and will incorporate revisions to strengthen the validation of the adaptive modality masking strategy and enhance the experimental analysis with additional statistical measures and ablations.
Point-by-point responses
- Referee: [Method (adaptive modality masking)] The adaptive modality masking strategy (described in the abstract and method) is load-bearing for the claim that balanced diffusion features drive the performance gains. No quantitative validation is provided (e.g., per-modality loss statistics, distribution histograms, or ablation comparing masked vs. unmasked pre-training) to confirm that masking equalizes modality influence without discarding critical complementary spectral or structural information from any sensor. If masking trades off unique features, the downstream results could be explained by the multi-branch network alone.
  Authors: We appreciate the referee's emphasis on this point, as the adaptive modality masking is central to our claim of achieving modality-balanced diffusion features. The current manuscript describes the strategy in detail and integrates it into the pre-training process to mitigate spectral dominance. However, we acknowledge that explicit quantitative evidence, such as per-modality loss curves or histograms showing balanced influence, would further substantiate that critical complementary information is preserved. In the revised manuscript, we will add these analyses along with an ablation comparing masked versus unmasked pre-training on the downstream classification task. This will demonstrate that the masking equalizes modality contributions without discarding unique spectral or structural details from any sensor.
  Revision: yes
- Referee: [§4] §4 (experiments): The manuscript reports superior performance on four datasets but provides no error bars, statistical significance tests, or comprehensive ablation studies isolating the contribution of the masking strategy, hierarchical diffusion guidance, and mutual learning. This weakens verification of the central claim that the balanced diffusion features are responsible for the improvements.
  Authors: We agree that the experimental section would benefit from greater statistical rigor and more targeted ablations to isolate each component's contribution. The current results demonstrate consistent superiority across four datasets, but we recognize the value of error bars from multiple runs and significance testing to confirm the gains are not due to the multi-branch architecture alone. In the revised version, we will report mean and standard deviation over multiple independent runs, include statistical significance tests (such as paired t-tests against baselines), and expand the ablation studies to separately evaluate the adaptive modality masking, hierarchical diffusion guidance mechanisms, and mutual learning strategy. These additions will more clearly attribute performance improvements to the balanced diffusion features.
  Revision: yes
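The reporting promised in the second response (mean and standard deviation over runs, plus a paired t-test against a baseline) can be sketched with the Python standard library alone. The helper names are illustrative, and the sketch stops at the t statistic and degrees of freedom because the standard library has no t-distribution; a p-value would need a statistics package such as SciPy's `scipy.stats.ttest_rel`.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of per-run accuracies."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_statistic(method_scores, baseline_scores):
    """Paired t statistic for runs that share seeds or data splits.

    Returns (t, degrees_of_freedom). Both lists must be the same length
    and ordered so that index i refers to the same run in each.
    """
    if len(method_scores) != len(baseline_scores):
        raise ValueError("paired samples must have the same length")
    diffs = [m - b for m, b in zip(method_scores, baseline_scores)]
    n = len(diffs)
    spread = statistics.stdev(diffs)
    if spread == 0.0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (spread / math.sqrt(n))
    return t, n - 1
```

Pairing matters here: comparing the two methods run by run removes the variance that comes from the seed or split itself, which is usually large relative to the gaps between recent methods on these benchmarks.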
Circularity Check
No significant circularity; framework validated on external public datasets with released code
Full rationale
The paper presents BDGF as an engineering framework: adaptive modality masking during DDPM pre-training, hierarchical diffusion-feature guidance via fusion/attention mechanisms across CNN/Mamba/transformer branches, and mutual learning via entropy/similarity alignment. All central performance claims are tied to empirical results on four independent public multimodal remote sensing datasets rather than any equation or parameter that reduces by construction to the inputs. No self-definitional loops, fitted-input predictions, or load-bearing self-citations appear in the derivation; the method is self-contained against external benchmarks.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem reality_from_one_distinction (tag: unclear). Paper passage: "we propose an adaptive modality masking strategy to encourage the DDPMs to obtain a modality-balanced rather than spectral image-dominated data distribution"
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear). Paper passage: "these diffusion features hierarchically guide feature extraction among CNN, Mamba, and transformer networks by integrating feature fusion, group channel attention, and cross-attention mechanisms"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.