HyperFM: An Efficient Hyperspectral Foundation Model with Spectral Grouping
Pith reviewed 2026-05-09 23:52 UTC · model grok-4.3
The pith
HyperFM uses spectral grouping with intra- and inter-group attention plus hybrid parameter decomposition to build an efficient foundation model that improves cloud property retrieval from PACE hyperspectral data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HyperFM is a parameter-efficient hyperspectral foundation model that leverages intra-group and inter-group spectral attention along with hybrid parameter decomposition to capture complex spectral-spatial relationships in PACE observations. It delivers consistent performance improvements over existing hyperspectral foundation models and task-specific state-of-the-art methods across four benchmark downstream atmospheric cloud property retrieval tasks while supporting both clear and cloudy scenes.
What carries the argument
Intra-group and inter-group spectral attention with hybrid parameter decomposition, which partitions the spectrum into groups to model local and global dependencies while keeping parameter count low.
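As a rough illustration of the two-stage idea, the pure-Python sketch below applies self-attention within each spectral group and then across one summary token per group. It is not HyperFM's implementation: the identity query/key/value projections, the mean pooling, the residual broadcast, the toy band values, and all function names are assumptions made for illustration.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections (assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in tokens])
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def grouped_spectral_attention(band_tokens, group_size):
    """Intra-group attention, then inter-group attention over mean-pooled summaries."""
    d = len(band_tokens[0])
    groups = [band_tokens[i:i + group_size]
              for i in range(0, len(band_tokens), group_size)]
    # Stage 1: each band attends only to bands in its own group (local dependencies).
    intra = [self_attention(g) for g in groups]
    # Stage 2: one mean-pooled summary per group attends to all summaries (global dependencies).
    summaries = [[sum(t[i] for t in g) / len(g) for i in range(d)] for g in intra]
    inter = self_attention(summaries)
    # Add each group's global context back onto its bands (hypothetical residual step).
    return [[x + c for x, c in zip(tok, ctx)]
            for g, ctx in zip(intra, inter) for tok in g]

# Toy input: 12 bands with 2 features each (made-up values, not PACE data).
bands = [[float(b % 3), float(b % 5) / 4.0] for b in range(12)]
out = grouped_spectral_attention(bands, group_size=4)
print(len(out), len(out[0]))
```

The output keeps one token per band, so a block like this could be stacked in the usual transformer fashion; a real model would add learned projections, multiple heads, and normalization.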
If this is right
- Consistent gains on cloud microphysics and related atmospheric retrievals from full-spectrum PACE data.
- Lower parameter count and faster inference than prior hyperspectral foundation models, enabling operational use.
- Handling of both clear-sky and cloudy scenes within a single model.
- Release of the HyperFM250K dataset for training or fine-tuning additional models.
Where Pith is reading between the lines
- The grouping strategy might extend to other instruments whose band counts differ from PACE, provided the intra- and inter-group logic is re-tuned.
- Reduced compute demand could support on-board or near-real-time processing of satellite streams for air-quality alerts.
- If the efficiency holds, similar decomposition patterns may apply to other high-dimensional remote-sensing modalities such as multi-temporal stacks.
Load-bearing premise
The combination of spectral grouping, intra- and inter-group attention, and hybrid decomposition will capture the needed relationships in hyperspectral data without overfitting to PACE or requiring large labeled sets.
What would settle it
Evaluating HyperFM on hyperspectral observations from a different satellite sensor or on a retrieval task outside the four cloud-property benchmarks and finding no improvement over current baselines would show the claimed gains do not hold.
Original abstract
The NASA PACE mission provides unprecedented hyperspectral observations of ocean color, aerosols, and clouds, offering new insights into how these components interact and influence Earth's climate and air quality. Its Ocean Color Instrument measures light across hundreds of finely spaced wavelength bands, enabling detailed characterization of features such as phytoplankton composition, aerosol properties, and cloud microphysics. However, hyperspectral data of this scale is large, complex, and difficult to label, requiring specialized processing and analysis techniques. Existing foundation models, which have transformed computer vision and natural language processing, are generally trained on standard RGB imagery and therefore struggle to interpret the continuous spectral signatures captured by PACE. While recent advances have introduced hyperspectral foundation models, they are typically trained on cloud-free observations and often remain limited to single-sensor datasets due to spectral inconsistencies across instruments. Moreover, existing models tend to be parameter-heavy and computationally expensive, limiting scalability and adoption in operational settings. To address these challenges, we introduce HyperFM, a parameter-efficient hyperspectral foundation model that leverages intra-group and inter-group spectral attention along with hybrid parameter decomposition to better capture spectral-spatial relationships while reducing computational cost. HyperFM demonstrates consistent performance improvements over existing hyperspectral foundation models and task-specific state-of-the-art methods across four benchmark downstream atmospheric cloud property retrieval tasks. To support further research, we additionally release HyperFM250K, a large-scale hyperspectral dataset from the PACE mission that includes both clear and cloudy scenes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces HyperFM, a parameter-efficient hyperspectral foundation model that uses intra-group and inter-group spectral attention combined with hybrid parameter decomposition to capture spectral-spatial relationships in large-scale PACE hyperspectral observations. It claims consistent performance improvements over prior hyperspectral foundation models and task-specific SOTA methods on four downstream atmospheric cloud property retrieval benchmarks, while also releasing the HyperFM250K dataset containing both clear and cloudy scenes from the PACE mission.
Significance. If the reported gains are shown to stem from the proposed architectural mechanisms rather than dataset differences, HyperFM could advance scalable hyperspectral modeling for Earth observation applications such as cloud microphysics retrieval from the PACE Ocean Color Instrument. The dataset release would further support community research on cloudy hyperspectral scenes.
Major comments (3)
- [Abstract] The central claim of 'consistent performance improvements' over existing hyperspectral foundation models and task-specific SOTA methods supplies no numerical metrics, error bars, baseline details, or experimental protocol for the four cloud property tasks, making it impossible to evaluate whether the data support the claim.
- [Experiments] The manuscript provides no evidence of controlled comparisons in which prior hyperspectral models are retrained or adapted on the new HyperFM250K dataset (which includes cloudy scenes absent from prior cloud-free training data); without such isolation or component ablations on intra-group/inter-group attention and hybrid decomposition, gains cannot be attributed to the architecture rather than distribution shift.
- [Model Architecture] The hybrid parameter decomposition and spectral grouping mechanisms are described at a high level without equations quantifying parameter reduction or computational cost relative to baselines, which is load-bearing for the efficiency claims that underpin the model's positioning as scalable for operational use.
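A back-of-envelope count of the kind the last comment asks for could look like the sketch below. It counts pairwise attention scores along the spectral axis for full attention versus a grouped scheme with one summary token per group. The band and group counts are illustrative assumptions, not PACE OCI or HyperFM values, and the function names are hypothetical.

```python
def attention_pairs_full(num_bands):
    # Every band attends to every band.
    return num_bands * num_bands

def attention_pairs_grouped(num_bands, num_groups):
    # Assumes equal-sized groups and a single summary token per group.
    if num_bands % num_groups != 0:
        raise ValueError("num_bands must be divisible by num_groups")
    group_size = num_bands // num_groups
    intra = num_groups * group_size * group_size  # attention inside each group
    inter = num_groups * num_groups               # attention among group summaries
    return intra + inter

full = attention_pairs_full(256)
grouped = attention_pairs_grouped(256, 16)
print(full, grouped)  # 65536 4352
```

Under these assumptions the grouped scheme scales as B²/G + G² rather than B², which is the kind of expression a revised model section could state explicitly.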
Minor comments (2)
- [Abstract] The abstract would be strengthened by briefly noting key quantitative results or at least the specific cloud property tasks (e.g., optical depth, effective radius) to allow readers to gauge the scope of the improvements.
- [Model Architecture] Notation for intra-group and inter-group attention should be defined more explicitly with reference to standard transformer attention formulations to improve clarity for readers unfamiliar with hyperspectral adaptations.
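One possible way to anchor that notation in the standard formulation is sketched below. The first line is the usual scaled dot-product attention; the remaining lines are an assumed grouped variant, with the symbols X_g (tokens of spectral group g), n_g (group size), s_g (group summary), and the primed projection matrices introduced here for illustration rather than taken from the paper.

```latex
\begin{aligned}
\operatorname{Attn}(Q, K, V) &= \operatorname{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
  && \text{(standard attention)} \\
Z_g &= \operatorname{Attn}(X_g W_Q,\; X_g W_K,\; X_g W_V), \quad g = 1, \dots, G
  && \text{(intra-group, assumed form)} \\
s_g &= \frac{1}{n_g} \sum_{i=1}^{n_g} Z_g[i, :], \qquad S = [s_1; \dots; s_G]
  && \text{(group summaries, assumed pooling)} \\
\tilde{S} &= \operatorname{Attn}(S W'_Q,\; S W'_K,\; S W'_V)
  && \text{(inter-group, assumed form)}
\end{aligned}
```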
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback, which has helped us strengthen the manuscript. We address each major comment point by point below and indicate the revisions made for the next version.
- Referee: [Abstract] The central claim of 'consistent performance improvements' over existing hyperspectral foundation models and task-specific SOTA methods supplies no numerical metrics, error bars, baseline details, or experimental protocol for the four cloud property tasks, making it impossible to evaluate whether the data support the claim.
  Authors: We agree that the abstract would benefit from explicit quantitative support. In the revised manuscript, we have updated the abstract to report key metrics, including average relative improvements (with standard deviations) across the four cloud property retrieval tasks, the specific baselines used, and a brief reference to the evaluation protocol and dataset splits detailed in Section 4. Full tables with error bars remain in the experiments section.
  revision: yes
- Referee: [Experiments] The manuscript provides no evidence of controlled comparisons in which prior hyperspectral models are retrained or adapted on the new HyperFM250K dataset (which includes cloudy scenes absent from prior cloud-free training data); without such isolation or component ablations on intra-group/inter-group attention and hybrid decomposition, gains cannot be attributed to the architecture rather than distribution shift.
  Authors: This concern is valid and we have addressed it directly. The revised experiments section now includes (i) results for prior hyperspectral foundation models fine-tuned on HyperFM250K to control for dataset effects, and (ii) targeted ablations that isolate the contributions of intra-group attention, inter-group attention, and the hybrid parameter decomposition. These additions demonstrate that the architectural components yield measurable gains even after accounting for the inclusion of cloudy scenes.
  revision: yes
- Referee: [Model Architecture] The hybrid parameter decomposition and spectral grouping mechanisms are described at a high level without equations quantifying parameter reduction or computational cost relative to baselines, which is load-bearing for the efficiency claims that underpin the model's positioning as scalable for operational use.
  Authors: We acknowledge that the original description was insufficiently quantitative. The revised model section now provides the explicit mathematical formulations for spectral grouping (intra- and inter-group attention) and the hybrid decomposition (combining low-rank and grouped factors). We have also added a dedicated efficiency table reporting parameter counts, FLOPs, and inference latency relative to the main baselines, directly supporting the scalability claims.
  revision: yes
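To make the parameter-count side of that response concrete, the sketch below compares a dense weight matrix with a low-rank factorization and a grouped (block-diagonal) one. How HyperFM actually mixes low-rank and grouped factors is not specified on this page, so the layer sizes, rank, group count, and function names are all illustrative assumptions; bias terms are ignored.

```python
def dense_params(d_in, d_out):
    # Full weight matrix W of shape (d_in, d_out).
    return d_in * d_out

def low_rank_params(d_in, d_out, rank):
    # W approximated as U @ V with U: (d_in, rank) and V: (rank, d_out).
    return rank * (d_in + d_out)

def grouped_params(d_in, d_out, groups):
    # Block-diagonal W: each of `groups` blocks maps d_in/groups to d_out/groups.
    if d_in % groups or d_out % groups:
        raise ValueError("dimensions must be divisible by groups")
    return groups * (d_in // groups) * (d_out // groups)

# Hypothetical 768x768 layer, rank 64, 4 groups (not values from the paper).
d = 768
print(dense_params(d, d))          # 589824
print(low_rank_params(d, d, 64))   # 98304
print(grouped_params(d, d, 4))     # 147456
```

An efficiency table of the kind the authors describe would report such counts for the real layer shapes alongside measured FLOPs and latency.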
Circularity Check
No circularity: empirical performance claims with no derivations or self-referential reductions
The paper introduces HyperFM as an architectural innovation (intra-group/inter-group spectral attention plus hybrid decomposition) and reports empirical gains on four downstream cloud property tasks using the new HyperFM250K dataset. No equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or structure. The central claim is a falsifiable empirical statement comparing model performance, not a mathematical reduction to its own inputs. The skeptic's concern about dataset confounding is a validity issue, not a circularity issue. This is a standard self-contained empirical contribution.
Forward citations
Cited by 2 Pith papers
- Foundation AI Models for Aerosol Optical Depth Estimation from PACE Satellite Data
  ViTCG, a channel-grouped Vision Transformer, retrieves AOD from PACE hyperspectral data with 62% lower MSE than prior foundation models while producing spatially coherent fields.
- SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining
  SpectralEarth-FM is a multisensor hierarchical transformer pretrained on a 40TB co-located HSI-MSI-SAR dataset using a JEPA-style objective and reports state-of-the-art results on hyperspectral and standard EO benchmarks.
Reference graph
Works this paper leans on
- [1] Peri Akiva, Matthew Purri, and Matthew Leotta. Self-supervised material and texture representation learning for remote sensing tasks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8203–8215, 2022.
- [2] Muhammad Awais, Muzammal Naseer, Salman Khan, Rao Muhammad Anwer, Hisham Cholakkal, Mubarak Shah, Ming-Hsuan Yang, and Fahad Shahbaz Khan. Foundation models defining a new era in vision: a survey and outlook. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
- [3] Nassim Ait Ali Braham, Conrad M Albrecht, Julien Mairal, Jocelyn Chanussot, Yi Wang, and Xiao Xiang Zhu. Spectralearth: Training hyperspectral foundation models at scale. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2025.
- [4] NASA Goddard Space Flight Center. Pace ocean color instrument (oci) version 3.1 data products overview. https://pace.oceansciences.org/access_pace_data.htm, 2024. Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) Mission.
- [5] Gordon Christie, Neil Fendley, James Wilson, and Ryan Mukherjee. Functional map of the world. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6172–6180, 2018.
- [6] Yezhen Cong, Samar Khanna, Chenlin Meng, Patrick Liu, Erik Rozi, Yutong He, Marshall Burke, David Lobell, and Stefano Ermon. Satmae: Pre-training transformers for temporal and multi-spectral satellite imagery. Advances in Neural Information Processing Systems, 35:197–211, 2022.
- [7] Alexey Dosovitskiy. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
- [8] Martin Hermann Paul Fuchs and Begüm Demir. Hyspecnet-11k: A large-scale hyperspectral dataset for benchmarking learning-based hyperspectral image compression methods. In IGARSS 2023-2023 IEEE International Geoscience and Remote Sensing Symposium, pages 1779–1782. IEEE, 2023.
- [9] Xin Guo, Jiangwei Lao, Bo Dang, Yingying Zhang, Lei Yu, Lixiang Ru, Liheng Zhong, Ziyuan Huang, Kang Wu, Dingxiang Hu, et al. Skysense: A multi-modal remote sensing foundation model towards universal interpretation for earth observation imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 27672–27683, 2024.
- [10] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- [11] Yuting He, Fuxiang Huang, Xinrui Jiang, Yuxiang Nie, Minghao Wang, Jiguang Wang, and Hao Chen. Foundation model for advancing healthcare: Challenges, opportunities and future directions. IEEE Reviews in Biomedical Engineering, 2024.
- [12] Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian Borth. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(7):2217–2226, 2019.
- [13] Oleksii Hrinchuk, Valentin Khrulkov, Leyla Mirvakhabova, Elena Orlova, and Ivan Oseledets. Tensorized embedding layers. In Findings of the association for computational linguistics: EMNLP 2020, pages 4847–4860, 2020.
- [14] He Huang, Quan Wang, Chao Liu, and Chen Zhou. Optimal estimation of cloud properties from thermal infrared observations with a combination of deep learning and radiative transfer simulation. Atmospheric Measurement Techniques, 17(24):7129–7141, 2024.
- [15] Intergovernmental Panel on Climate Change (IPCC). IPCC Official Website, 2024. Accessed: 2024-12-23.
- [16] Thomas A Jones, David J Stensrud, Patrick Minnis, and Rabindra Palikonda. Evaluation of a forward operator to assimilate cloud water path into wrf-dart. Monthly weather review, 141(7):2272–2289, 2013.
- [17] Hermann Kaufmann, S Förster, Hendrik Wulf, K Segl, Luis Guanter, M Bochow, U Heiden, A Müller, W Heldens, T Schneiderhan, et al. Science plan of the environmental mapping and analysis program (enmap). 2012.
- [18] Cathy Kessinger, Michael Donovan, Richard Bankert, Earle Williams, Jeffrey Hawkins, Huaqing Cai, Nancy Rehak, Daniel Megenhardt, and Matthias Steiner. Convection diagnosis and nowcasting for oceanic aviation applications. In Remote Sensing Applications for Aviation Weather Hazard Detection and Decision Support, pages 77–88. SPIE, 2008.
- [19] Jingwei Li, Feng Zhang, Wenwen Li, Xuan Tong, BaoXiang Pan, Jun Li, Han Lin, Husi Letu, and Frahan Mustafa. Transfer-learning-based approach to retrieve the cloud properties using diverse remote sensing datasets. IEEE Transactions on Geoscience and Remote Sensing, 2023.
- [20] Jingtao Li, Yingyi Liu, Xinyu Wang, Yunning Peng, Chen Sun, Shaoyu Wang, Zhendong Sun, Tian Ke, Xiao Jiang, Tangwei Lu, et al. Hyperfree: A channel-adaptive and tuning-free foundation model for hyperspectral remote sensing imagery. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 23048–23058, 2025.
- [21] Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, and Xin Jiang. Hypoformer: Hybrid decomposition transformer for edge-friendly neural machine translation. In Proceedings of the 2022 conference on empirical methods in natural language processing, pages 7056–7068, 2022.
- [22] Wenwen Li, Feng Zhang, Bin Guo, Haoyang Fu, and Husi Letu. Physics-driven machine learning algorithm facilitates multilayer cloud property retrievals from geostationary passive imager measurements. IEEE Transactions on Geoscience and Remote Sensing, 62:1–18, 2024.
- [23] Xuyang Li, Danfeng Hong, and Jocelyn Chanussot. S2mae: A spatial-spectral pretraining foundation model for spectral remote sensing data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 24088–24097, 2024.
- [24] Rebecca Lindsey and David Herring. MODIS: Moderate Resolution Imaging Spectroradiometer: NASA’s Earth Observing System. Goddard Space Flight Center, 2002.
- [25] Fan Liu, Delong Chen, Zhangqingyun Guan, Xiaocong Zhou, Jiale Zhu, Qiaolin Ye, Liyong Fu, and Jun Zhou. Remoteclip: A vision language foundation model for remote sensing. IEEE Transactions on Geoscience and Remote Sensing, 62:1–16, 2024.
- [26]
- [27] Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021.
- [28] Teruyuki Nakajima and Michael D King. Determination of the optical thickness and effective particle radius of clouds from reflected solar radiation measurements. part i: Theory. Journal of Atmospheric Sciences, 47(15):1878–1893, 1990.
- [29] NASA Goddard Space Flight Center. PACE Science Data Reprocessing Version 3.x Notes. https://oceancolor.gsfc.nasa.gov/files/data/reprocessing/V3/PACE_Reprocessing_V3.x_notes.pdf, 2025. Accessed: 2025-11-20.
- [30] Vikas Nataraja, Sebastian Schmidt, Hong Chen, Takanobu Yamaguchi, Jan Kazil, Graham Feingold, Kevin Wolf, and Hironobu Iwabuchi. Segmentation-based multi-pixel cloud optical thickness retrieval using a convolutional neural network. Atmospheric Measurement Techniques Discussions, pages 1–34, 2022.
- [31] Jens Nieke and Michael Rast. Towards the copernicus hyperspectral imaging mission for the environment (chime). In Igarss 2018-2018 ieee international geoscience and remote sensing symposium, pages 157–159. IEEE, 2018.
- [32] Matan Ben Noach and Yoav Goldberg. Compressing pre-trained language models by matrix decomposition. In Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, pages 884–889, 2020.
- [33] Mubashir Noman, Muzammal Naseer, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, and Fahad Shahbaz Khan. Rethinking transformers pre-training for multi-spectral satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 27811–27819, 2024.
- [34] Rintaro Okamura, Hironobu Iwabuchi, and K Sebastian Schmidt. Feasibility study of multi-pixel retrieval of optical thickness and droplet effective radius of inhomogeneous clouds using deep learning. Atmospheric Measurement Techniques, 10(12):4747–4759, 2017.
- [35] Stefano Pignatti, Angelo Palombo, Simone Pascucci, Filomena Romano, Federico Santini, Tiziana Simoniello, Amato Umberto, Cuomo Vincenzo, Nicola Acito, Marco Diani, et al. The prisma hyperspectral mission: Science activities and opportunities for agriculture and land monitoring. In 2013 IEEE international geoscience and remote sensing symposium-IGARSS, ...
- [36] S Platnick, S Ackerman, M King, K Meyer, WP Menzel, RE Holz, BA Baum, and P Yang. Modis atmosphere l2 cloud product (06 l2), nasa modis adaptive processing system, goddard space flight center. URL http://dx.doi.org/10.5067/MODIS/MOD06 L, 2, 2015.
- [37] S Platnick, KG Meyer, P Hubanks, R Holz, SA Ackerman, and AK Heidinger. Viirs atmosphere l3 cloud properties product. Version-1.1, NASA Level-1 and Atmosphere Archive & Distribution System (LAADS) Distributed Active Archive Center (DAAC), Goddard Space Flight Center, 2019.
- [38] CA Poulsen, R Siddans, GE Thomas, AM Sayer, RG Grainger, E Campmany, SM Dean, C Arnold, and PD Watts. Cloud retrievals from satellite data using optimal estimation: evaluation and application to atsr. Atmospheric Measurement Techniques, 5(8):1889–1910, 2012.
- [39] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International conference on machine learning, pages 8748–8763. PMLR, 2021.
- [40] Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. Zero-shot text-to-image generation. In International conference on machine learning, pages 8821–8831. PMLR, 2021.
- [41] Colorado J Reed, Ritwik Gupta, Shufan Li, Sarah Brockman, Christopher Funk, Brian Clipp, Kurt Keutzer, Salvatore Candido, Matt Uyttendaele, and Trevor Darrell. Scale-mae: A scale-aware masked autoencoder for multiscale geospatial representation learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4088–4099, 2023.
- [42] Linus Scheibenreif, Michael Mommert, and Damian Borth. Masked vision transformers for hyperspectral image classification. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2166–2176,
- [43] Vladan Stojnic and Vladimir Risojevic. Self-supervised learning of remote sensing scene representations using contrastive multiview coding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1182–1191, 2021.
- [44] Gencer Sumbul, Marcela Charfuelan, Begüm Demir, and Volker Markl. Bigearthnet: A large-scale benchmark archive for remote sensing image understanding. In IGARSS 2019-2019 IEEE international geoscience and remote sensing symposium, pages 5901–5904. IEEE, 2019.
- [45] Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, and Matthew Mattina. Rank and run-time aware compression of nlp applications. arXiv preprint arXiv:2010.03193, 2020.
- [46] Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, and Yinxiao Li. Maxvit: Multi-axis vision transformer. In European conference on computer vision, pages 459–479. Springer, 2022.
- [47] Zahid Hassan Tushar, Adeleke Ademakinwa, Jianwu Wang, Zhibo Zhang, and Sanjay Purushotham. Cloudunet: Adapting unet for retrieving cloud properties. In IGARSS 2024 IEEE International Geoscience and Remote Sensing Symposium, pages 7163–7167. IEEE, 2024.
- [48] Zahid Hassan Tushar, Adeleke Ademakinwa, Jianwu Wang, Zhibo Zhang, and Sanjay Purushotham. Joint retrieval of cloud properties using attention-based deep learning models. In IGARSS 2025-2025 IEEE International Geoscience and Remote Sensing Symposium, pages 4616–4621. IEEE, 2025.
- [49] Di Wang, Meiqi Hu, Yao Jin, Yuchun Miao, Jiaqi Yang, Yichu Xu, Xiaolei Qin, Jiaqi Ma, Lingyu Sun, Chenxing Li, et al. Hypersigma: Hyperspectral intelligence comprehension foundation model. PAMI, 2025.
- [50] Quan Wang, Chen Zhou, Xiaoyong Zhuge, Chao Liu, Fuzhong Weng, and Minghuai Wang. Retrieval of cloud properties from thermal infrared radiometry using convolutional neural network. Remote Sensing of Environment, 278:113079, 2022.
- [51] Yue Wang, Ming Wen, Hailiang Zhang, Jinyu Sun, Qiong Yang, Zhimin Zhang, and Hongmei Lu. Hsimae: A unified masked autoencoder with large-scale pre-training for hyperspectral image classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,
- [52] David M Winker, Jacques R Pelon, and M Patrick McCormick. Calipso mission: spaceborne lidar for observation of aerosols and clouds. In Lidar remote sensing for industry and environment monitoring III, pages 1–11. SPIE, 2003.
- [53] Aoran Xiao, Weihao Xuan, Junjue Wang, Jiaxing Huang, Dacheng Tao, Shijian Lu, and Naoto Yokoya. Foundation models for remote sensing and earth observation: A survey. IEEE Geoscience and Remote Sensing Magazine, 2025.
- [54] Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, et al. A large-scale evaluation of speech foundation models. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32:2884–2899, 2024.
- [55] Maxime Zanella and Ismail Ben Ayed. Low-rank few-shot adaptation of vision-language models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1593–1603, 2024.
- [56] Juanping Zhao, Zenghui Zhang, Wei Yao, Mihai Datcu, Huilin Xiong, and Wenxian Yu. Opensarurban: A sentinel-1 sar image dataset for urban interpretation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 13:187–203, 2020.
- [57] Xin Zhou, Yangang Liu, Yunpeng Shan, Satoshi Endo, Yu Xie, and Manajit Sengupta. Influences of cloud microphysics on the components of solar irradiance in the wrf-solar model. Atmosphere, 15(1):39, 2023.
- [58] Yanqi Zhou, Tao Lei, Hanxiao Liu, Nan Du, Yanping Huang, Vincent Zhao, Andrew M Dai, Quoc V Le, James Laudon, et al. Mixture-of-experts with expert choice routing. Advances in Neural Information Processing Systems, 35:7103–7114, 2022.
- [59] Weiming Zhuang, Chen Chen, Zhizhong Li, Sina Sajadmanesh, Jingtao Li, Jiabo Huang, Vikash Sehwag, Vivek Sharma, Hirotaka Shinozaki, Felan Carlo Garcia, et al. Argus: A compact and versatile foundation model for vision. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4418–4429, 2025.
- [60] Our HyperFM250K Dataset. Hyperspectral imaging from space offers detailed spectral information about the Earth’s surface and atmosphere, and recent missions have significantly increased the volume and quality of available data. Notable examples include EnMAP [17], PRISMA [35], and the forthcoming CHIME mission [31]. These systems are optimized for land-f...
- [61] Additional Results. We present additional results from Sec. 6 here which were excluded due to space limitation. We compared with another recent hyperspectral foundation model called HyperFree [20] by loading their ViT-base weights and adding the convolutional decoder as shown in Fig. 5. Note that we remove the neck from HyperFree ViT-b encoder for fair...