arxiv: 2604.09948 · v1 · submitted 2026-04-10 · 💻 cs.CV

Unmixing-Guided Spatial-Spectral Mamba with Clustering Tokens for Hyperspectral Image Classification

Yimin Zhu , Lincoln Linlin Xu This is my paper

Pith reviewed 2026-05-10 16:39 UTC · model grok-4.3

classification 💻 cs.CV

keywords hyperspectral image classificationspectral unmixingMamba modelclustering tokensabundance mapsendmember variabilityspatial-spectral featuresmulti-task learning

0 comments p. Extension

The pith

Spectral unmixing combined with abundance-guided clustering tokens and a spatial-spectral Mamba module improves hyperspectral image classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that first uses a spectral unmixing network to automatically extract endmembers and abundance maps from hyperspectral data while modeling endmember variations. These abundance maps then define clusters, from which a Top-K selection strategy builds efficient token sequences for input to a custom Mamba module that processes both spatial and spectral information. A multi-task training scheme jointly supervises unmixing and classification so the model produces not only class maps but also interpretable spectral libraries and abundance estimates. Readers would care because hyperspectral classification underpins applications such as land monitoring and resource assessment, where mixed pixels and boundary loss currently limit reliability.

Core claim

The authors establish that a spectral unmixing network that learns endmembers and abundance maps while accounting for variabilities, followed by abundance-map cluster definition, Top-K token selection for sequencing, and an unmixing-guided spatial-spectral Mamba module inside a multi-task unmixing-classification framework, yields classification maps that outperform prior state-of-the-art methods on four hyperspectral datasets while also delivering spectral libraries and abundance maps.

What carries the argument

The unmixing-guided spatial-spectral Mamba module, which receives adaptively sequenced tokens from abundance-map-derived clusters via Top-K selection to perform joint spatial-spectral feature learning.

If this is right

The model simultaneously produces accurate classification maps, a spectral library, and abundance maps from a single training process.
Accounting for endmember variabilities inside the unmixing network reduces errors from mixed pixels in heterogeneous scenes.
Top-K token sequencing based on abundance clusters improves Mamba's ability to capture spatial-spectral patterns while preserving boundaries.
The multi-task supervision allows the learned representations to support both physical unmixing and semantic classification objectives.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The physical outputs (endmembers and abundances) could support downstream tasks such as material identification or change detection that require interpretable spectral decomposition.
The clustering-token approach might extend to other sequence models in remote sensing where pixel mixing is common, such as multispectral or LiDAR data.
Varying the number of clusters or the K value in Top-K selection could be tested on scenes with different levels of spatial complexity to optimize the trade-off between efficiency and detail preservation.

Load-bearing premise

The spectral unmixing network accurately disentangles mixtures and accounts for endmember variabilities, and the abundance-map-derived clusters plus Top-K selection meaningfully improve Mamba token sequencing and feature learning without loss of critical boundary information.

What would settle it

Direct comparison of overall accuracy, average accuracy, and kappa coefficient on the same four HSI datasets against the reported state-of-the-art baselines; absence of consistent outperformance would falsify the central performance claim.

Figures

Figures reproduced from arXiv: 2604.09948 by Lincoln Linlin Xu, Yimin Zhu.

**Figure 1.** Figure 1: Traditional Mamba token sequencing treats an entire image or uses a pre-defined way—despite its highly heterogeneous spatial patterns—as a single long sequence, which suffers from being inefficient at modeling fine-scale structures. In contrast, our Unmixing-guided per-abundance token sequencing decomposes the hyperspectral image into physically meaningful abundance maps, which better capture subtle and we… view at source ↗

**Figure 2.** Figure 2: The proposed model contains (1) a semi-blind spectral unmixing branch for abundance learning and endmember [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: The classification map of (a) RF (b) SSRN (c) SS [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: The estimated abundance map from (a) DSNet, (b) [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

read the original abstract

Although hyperspectral image (HSI) classification is critical for supporting various environmental applications, it is a challenging task due to the spectral-mixture effect, the spatial-spectral heterogeneity and the difficulty to preserve class boundaries and details. This letter presents a novel unmixing-guided spatial-spectral Mamba with clustering tokens for improved HSI classification, with the following contributions. First, to disentangle the spectral mixture effect in HSI for improved pattern discovery, we design a novel spectral unmixing network that not only automatically learns endmembers and abundance maps from HSI but also accounts for endmember variabilities. Second, to generate Mamba token sequences, based on the clusters defined by abundance maps, we design an efficient Top-\textit{K} token selection strategy to adaptively sequence the tokens for improved representational capability. Third, to improve spatial-spectral feature learning and detail preservation, based on the Top-\textit{K} token sequences, we design a novel unmixing-guided spatial-spectral Mamba module that greatly improves traditional Mamba models in terms of token learning and sequencing. Fourth, to learn simultaneously the endmember-abundance patterns and classification labels, a multi-task scheme is designed for model supervision, leading to a new unmixing-classification framework that outputs not only accurate classification maps but also a comprehensive spectral-library and abundance maps. Comparative experiments on four HSI datasets demonstrate that our model can greatly outperform the other state-of-the-art approaches. Code is available at https://github.com/GSIL-UCalgary/Unmixing_guided_Mamba.git

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper ties spectral unmixing and abundance-based token clustering into a Mamba backbone for HSI classification in a multi-task setup, and the design looks internally consistent with standard practices.

read the letter

The core new piece is using a learned unmixing network that produces abundance maps, then deriving clusters from those maps to drive Top-K token selection and sequencing inside a spatial-spectral Mamba module, all trained jointly with the classification head. This directly targets the mixing problem and tries to keep boundary detail without the usual heavy token overhead of transformers. The multi-task supervision and the decision to release code are both practical moves that make the work easier to build on. The architecture choices follow logically from the stated goals of handling endmember variability and improving token ordering. The main soft spot is that the abstract asserts clear outperformance on four datasets without any numbers, error bars, or ablation details visible here, so the size of the gains and whether the unmixing step actually delivers accurate abundances remain to be checked in the results section. The Top-K selection could still risk dropping small boundary regions even if the clustering is meant to protect them. This is aimed at the HSI classification community and people experimenting with Mamba variants in remote sensing. A reader working on efficient spectral-spatial models would find the token strategy and joint training worth looking at. The work is coherent enough on its own terms to merit a full referee process rather than a desk reject.

Referee Report

2 major / 4 minor

Summary. The paper proposes a novel unmixing-guided spatial-spectral Mamba architecture with clustering tokens for hyperspectral image (HSI) classification. It introduces a spectral unmixing network that learns endmembers and abundance maps while modeling endmember variabilities, employs abundance-map-derived clusters with a Top-K token selection strategy to generate sequences for a Mamba module, designs an unmixing-guided spatial-spectral Mamba for improved feature learning and boundary preservation, and uses multi-task supervision to jointly optimize unmixing and classification. The central claim is that this framework greatly outperforms state-of-the-art methods on four HSI datasets, while also outputting spectral libraries and abundance maps; code is publicly released.

Significance. If the empirical results hold, the work could meaningfully advance HSI classification by explicitly addressing spectral mixture effects through unmixing and by adapting Mamba token sequencing for spatial-spectral data, potentially improving detail preservation in heterogeneous scenes. The multi-task output of both classification maps and physically interpretable abundance maps adds practical value for remote-sensing applications. Public code availability is a clear strength supporting reproducibility.

major comments (2)

[§4] §4 (Experiments) and Table 2/3: The central claim of outperformance over SOTA methods on four datasets is load-bearing, yet the manuscript must supply complete quantitative results (OA, AA, Kappa) with standard deviations across multiple runs or cross-validation folds; without these, the superiority cannot be rigorously assessed against the weakest assumption that the unmixing and clustering steps meaningfully improve performance.
[§3.1] §3.1 (Spectral Unmixing Network): The assertion that the network accounts for endmember variabilities is central to the first contribution and to the multi-task framework, but no quantitative validation of unmixing quality (e.g., reconstruction error, SAD, or abundance RMSE) or ablation isolating its effect on classification accuracy is provided; this leaves open whether the unmixing component drives the reported gains or is incidental.

minor comments (4)

[Abstract] Abstract: The four datasets are not named; explicitly listing them (e.g., Indian Pines, Pavia University, etc.) would improve immediate clarity.
[§3.2] §3.2 (Top-K token selection): The choice of K and the number of clusters/endmembers are free parameters; a brief sensitivity analysis or default values with justification would strengthen the method description.
[Figure 1] Figure 1 (architecture diagram): Labels for the flow from abundance maps to clustering and Top-K selection could be enlarged or annotated more clearly to aid readers.
[§3] Notation: Ensure consistent symbols for endmembers (E) and abundances (A) across equations and text; minor inconsistencies appear in the method section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for minor revision. We address the two major comments below and will update the manuscript to incorporate the suggested improvements for stronger empirical validation.

read point-by-point responses

Referee: [§4] §4 (Experiments) and Table 2/3: The central claim of outperformance over SOTA methods on four datasets is load-bearing, yet the manuscript must supply complete quantitative results (OA, AA, Kappa) with standard deviations across multiple runs or cross-validation folds; without these, the superiority cannot be rigorously assessed against the weakest assumption that the unmixing and clustering steps meaningfully improve performance.

Authors: We agree that standard deviations are necessary for rigorous assessment of the reported gains. In the revised version, we will rerun the experiments on all four datasets with multiple random seeds (at least five independent trials) and report mean OA, AA, and Kappa values along with their standard deviations in Tables 2 and 3. This will allow direct comparison against the possibility that unmixing and clustering contribute meaningfully to performance. revision: yes
Referee: [§3.1] §3.1 (Spectral Unmixing Network): The assertion that the network accounts for endmember variabilities is central to the first contribution and to the multi-task framework, but no quantitative validation of unmixing quality (e.g., reconstruction error, SAD, or abundance RMSE) or ablation isolating its effect on classification accuracy is provided; this leaves open whether the unmixing component drives the reported gains or is incidental.

Authors: We acknowledge the value of explicit quantitative validation for the unmixing network. We will add a dedicated subsection in the experiments reporting reconstruction error, spectral angle distance (SAD), and abundance RMSE to demonstrate the quality of the learned endmembers and abundances. We will also include an ablation study that removes the unmixing guidance and compares classification accuracy to isolate its contribution. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces an end-to-end trainable architecture that jointly optimizes a spectral unmixing network (producing endmembers and abundance maps) and a classification head via multi-task supervision. Token sequencing via Top-K clustering on abundance maps and the subsequent unmixing-guided Mamba module are architectural choices whose outputs are not algebraically or statistically forced to equal their inputs; the unmixing objective and classification objective remain distinct loss terms. Comparative results on four public HSI datasets are reported against external baselines without any fitted parameter being relabeled as a prediction or any uniqueness claim resting on prior self-citation. The derivation chain is therefore self-contained and externally falsifiable through standard benchmark performance.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 2 invented entities

Only the abstract is available, so the ledger is necessarily incomplete. The central claim rests on the effectiveness of newly proposed modules whose internal mechanics and hyperparameter choices are not detailed.

free parameters (2)

K for Top-K token selection
Hyperparameter controlling how many tokens are selected per cluster from abundance maps.
Number of endmembers/clusters
Determines the dimensionality of the unmixing output and clustering; likely dataset-dependent.

axioms (1)

domain assumption Hyperspectral data can be modeled as linear mixtures of endmembers with additive variability
Underlies the design of the spectral unmixing network.

invented entities (2)

Unmixing-guided spatial-spectral Mamba module no independent evidence
purpose: To enhance token learning and sequencing by incorporating unmixing information
New module proposed to improve upon standard Mamba for HSI data.
Clustering tokens via abundance maps no independent evidence
purpose: To adaptively generate token sequences for Mamba based on unmixing clusters
Novel selection strategy tied to the unmixing output.

pith-pipeline@v0.9.0 · 5589 in / 1455 out tokens · 49443 ms · 2026-05-10T16:39:16.770917+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

spectral unmixing network that automatically learns endmembers and abundance maps... Top-K token selection strategy... unmixing-guided spatial-spectral Mamba module... multi-task scheme
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

LMM: H = S^T A + N, s.t. A ≥ 0, 1^T A = J; abundance A = |F_abu(p)| / sum |F_abu(p)|

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

11 extracted references · 11 canonical work pages

[1]

11em plus .33em minus .07em 4000 4000 100 4000 4000 500 `\.=1000 = #1 \@IEEEnotcompsoconly \@IEEEcompsoconly #1 * [1] 0pt [0pt][0pt] #1 * [1] 0pt [0pt][0pt] #1 * \| ** #1 \@IEEEauthorblockNstyle \@IEEEcompsocnotconfonly \@IEEEauthorblockAstyle \@IEEEcompsocnotconfonly \@IEEEcompsocconfonly \@IEEEauthordefaulttextstyle \@IEEEcompsocnotconfonly \@IEEEauthor...

work page
[2]

Z. Han, J. Yang, L. Gao, Z. Zeng, B. Zhang, and J. Chanussot, ``Subpixel spectral variability network for hyperspectral image classification,'' IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1--14, 2025

work page 2025
[3]

------, ``Dual-branch subpixel-guided network for hyperspectral image classification,'' IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1--13, 2024

work page 2024
[4]

Zhong, J

Z. Zhong, J. Li, Z. Luo, and M. Chapman, ``Spectral–spatial residual network for hyperspectral image classification: A 3-d deep learning framework,'' IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 2, pp. 847--858, 2018

work page 2018
[5]

Zhong, Y

Z. Zhong, Y. Li, L. Ma, J. Li, and W.-S. Zheng, ``Spectral–spatial transformer network for hyperspectral image classification: A factorized architecture search framework,'' IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1--15, 2022

work page 2022
[6]

X. Liu, C. Zhang, F. Huang, S. Xia, G. Wang, and L. Zhang, ``Vision mamba: A comprehensive survey and taxonomy,'' IEEE Transactions on Neural Networks and Learning Systems, 2025

work page 2025
[7]

K. Han, A. Vedaldi, and A. Zisserman, ``Learning to discover novel visual categories via deep transfer clustering,'' in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 8401--8409

work page 2019
[8]

Y. Zhu, K. Yuan, W. Zhong, and L. Xu, ``Spatial–spectral convnext for hyperspectral image classification,'' IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 16, pp. 5453--5463, 2023

work page 2023
[9]

Y. Li, Y. Luo, L. Zhang, Z. Wang, and B. Du, ``Mambahsi: Spatial–spectral mamba for hyperspectral image classification,'' IEEE Transactions on Geoscience and Remote Sensing, vol. 62, pp. 1--16, 2024

work page 2024
[10]

Z. Meng, L. Yue, and F. Zhao, ``Spatial-frequency joint learning mamba for hyperspectral image classification,'' IEEE Geoscience and Remote Sensing Letters, vol. 23, pp. 1--5, 2026

work page 2026
[11]

L. L. Xu, Y. Zhu, Z. Dewis, Z. Xu, M. Alkayid, M. Heffring, and S. Taleghanidoozdoozan, ``Sparse deformable mamba for hyperspectral image classification,'' IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1--5, 2025

work page 2025