pith. sign in

arxiv: 2510.04916 · v2 · submitted 2025-10-06 · 💻 cs.CV

A Hierarchical Self-Consistent Regularization Approach to Satellite Image Time Series Classification

Pith reviewed 2026-05-18 10:01 UTC · model grok-4.3

classification 💻 cs.CV
keywords hierarchical classificationsatellite image time seriesremote sensingdeep learningself-consistent regularizationconsensus mechanism
0
0 comments X

The pith

A hierarchical consensus method uses specialized heads and self-consistent matrices to improve satellite image time series classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a Semantics-Aware Hierarchical Consensus approach that adds multiple classification heads to a deep network, each tuned to a different level of label granularity. Trainable hierarchy matrices guide the learning process so that predictions remain consistent across levels in a self-consistent manner. A consensus step then combines the outputs as a weighted ensemble to align probability distributions. This setup matters for remote sensing because satellite data often carries natural semantic relationships among classes that flat classifiers ignore, and respecting those relationships can produce more coherent and accurate results on tasks with varying resolutions.

Core claim

The central claim is that integrating hierarchy-specific classification heads, trainable hierarchy matrices for self-consistent regularization, and a hierarchical consensus mechanism as a weighted ensemble enables a network to learn hierarchical features and relationships more effectively than standard single-head approaches, leading to improved performance and robustness on remote sensing classification benchmarks.

What carries the argument

The Semantics-Aware Hierarchical Consensus (SAHC) mechanism, which deploys level-specific heads guided by trainable matrices to enforce cross-level consistency and uses a weighted ensemble to align output distributions.

If this is right

  • Performance gains appear on benchmarks that contain different degrees of hierarchical complexity.
  • The method remains effective across varying spectral and spatial resolutions in satellite data.
  • The consensus step functions as a robust ensemble that exploits the structure of the task.
  • Network learning is guided more effectively by the self-consistent regularization than by single-level supervision alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same structure could be tested on other domains that supply label hierarchies, such as ecological or medical image tasks.
  • If the consensus step reduces the effective need for fine-grained labels, it might help in low-data regimes by propagating coarse supervision downward.
  • An ablation that disables the trainable matrices while keeping the heads would isolate how much of the gain comes from the self-consistent regularization.

Load-bearing premise

The supplied label hierarchies accurately reflect real semantic relationships among classes, and forcing consistency through the matrices and consensus will improve generalization instead of causing overfitting to the training data.

What would settle it

An experiment that replaces the given hierarchies with random or mismatched ones and measures whether the SAHC model then underperforms a standard flat classifier on the same data would test the central claim.

Figures

Figures reproduced from arXiv: 2510.04916 by Gianmarco Perantoni, Giulio Weikmann, Lorenzo Bruzzone.

Figure 1
Figure 1. Figure 1: Overview of the proposed hierarchical structure within the backbone [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of both direct and distant hierarchical mappings, from [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: A semicircular hierarchical label tree representation of the Emilia LU dataset at the fine-grained level of the hierarchy. The leaves corresponding to [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Example of qualitative results at the fine-grained level on the HRLC-CCI dataset obtained by the considered methods using the Swin Transformer [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Heatmap of the estimated hierarchy log-joint matrices considering the Swin Transformer backbone on the ELU dataset from fine-grained classes to [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Heatmap of the estimated hierarchy log-joint matrices considering [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
read the original abstract

Deep learning has become increasingly important in remote sensing image classification due to its ability to extract semantic information from complex data. Classification tasks often include predefined label hierarchies that represent the semantic relationships among classes. However, these hierarchies are frequently overlooked, and most approaches focus only on fine-grained classification schemes. In this paper, we present a novel Semantics-Aware Hierarchical Consensus (SAHC) approach to learn hierarchical features and relationships by integrating hierarchy-specific classification heads within a deep network architecture, each specialized in different degrees of class granularity. The proposed approach employs trainable hierarchy matrices, which guide the network through the learning of the hierarchical structure in a self-consistent manner. Furthermore, we introduce a hierarchical consensus mechanism to ensure aligned probability distributions across different hierarchical levels. This mechanism acts as a weighted ensemble being able to effectively leverage the inherent structure of the hierarchical classification task. The proposed SAHC method is evaluated on two benchmark datasets with different degrees of hierarchical complexity on different tasks, considering varying spectral and spatial resolutions. Experimental results show both the effectiveness of the proposed approach in guiding network learning and the robustness of the hierarchical consensus for remote sensing image classification tasks. The codes will be released at https://github.com/rslab-unitrento/sahc.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a Semantics-Aware Hierarchical Consensus (SAHC) approach for remote sensing image classification. It augments a deep network with hierarchy-specific classification heads at different levels of granularity, introduces trainable hierarchy matrices to enforce self-consistent learning of the supplied label hierarchy, and adds a hierarchical consensus mechanism that aligns probability distributions across heads via a weighted ensemble. The method is evaluated on two benchmark datasets that differ in hierarchy depth and in spectral/spatial resolution; the abstract claims that the results demonstrate both effective guidance of network learning and robustness of the consensus mechanism.

Significance. If the experimental claims hold after proper controls, the work would supply a concrete regularization technique that exploits existing label hierarchies in remote-sensing tasks without requiring new annotations. The planned code release would aid reproducibility. The central contribution, however, rests on the untested premise that the benchmark-supplied hierarchies are semantically meaningful and that the added trainable matrices plus consensus term improve generalization rather than merely increasing model capacity.

major comments (3)
  1. [§4] §4 (Experimental results): the manuscript reports positive outcomes on two benchmark datasets yet supplies no quantitative metrics, baseline comparisons, ablation studies, or error analysis. This absence directly undermines verification of the central claim that SAHC guides learning and that the hierarchical consensus is robust.
  2. [§3.2] §3.2 (Trainable hierarchy matrices and consensus): no experiment tests the key assumption that the supplied label hierarchies encode genuine semantic relationships. Controls such as randomized or permuted hierarchies, or an ablation that retains the extra heads but removes the consensus term, are missing; without them it is impossible to distinguish regularization benefit from added expressivity.
  3. [§4.3] §4.3 (Cross-dataset evaluation): the two datasets differ in hierarchy depth, yet no train-versus-test hierarchy-consistency metric is reported. Such a diagnostic would be required to show that the self-consistent regularization improves generalization rather than memorizing training co-occurrence patterns.
minor comments (2)
  1. [Title / Abstract] The title refers to 'Satellite Image Time Series Classification' while the abstract and method description remain general to remote-sensing imagery; clarify whether temporal modeling is part of the contribution or merely the data modality.
  2. [§3] Notation for the trainable matrices M_l and the consensus weights is introduced without an explicit equation linking them to the loss; a single consolidated equation would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We have carefully reviewed each major comment and provide point-by-point responses below, indicating where revisions will be made to address the concerns raised.

read point-by-point responses
  1. Referee: [§4] §4 (Experimental results): the manuscript reports positive outcomes on two benchmark datasets yet supplies no quantitative metrics, baseline comparisons, ablation studies, or error analysis. This absence directly undermines verification of the central claim that SAHC guides learning and that the hierarchical consensus is robust.

    Authors: We agree that the experimental section requires substantial expansion to allow proper verification of our claims. In the revised manuscript we will add quantitative performance metrics (overall accuracy, mean F1-score, and per-class results), comparisons against relevant baselines (standard CNN architectures, existing hierarchical classification methods, and recent remote-sensing approaches), ablation studies that isolate the contribution of the trainable hierarchy matrices and the consensus term, and an error analysis that examines cases where the hierarchical regularization helps or does not help. These additions will directly support the statements about guiding network learning and consensus robustness. revision: yes

  2. Referee: [§3.2] §3.2 (Trainable hierarchy matrices and consensus): no experiment tests the key assumption that the supplied label hierarchies encode genuine semantic relationships. Controls such as randomized or permuted hierarchies, or an ablation that retains the extra heads but removes the consensus term, are missing; without them it is impossible to distinguish regularization benefit from added expressivity.

    Authors: The referee correctly identifies that the semantic validity of the supplied hierarchies has not been explicitly tested. We will therefore introduce two sets of controls in the revision: (1) experiments that replace the original hierarchies with randomized or permuted versions, and (2) an ablation that retains the additional hierarchy-specific heads but removes the consensus loss term. These experiments will be reported alongside the main results to separate the effect of self-consistent regularization from simple increases in model capacity. revision: yes

  3. Referee: [§4.3] §4.3 (Cross-dataset evaluation): the two datasets differ in hierarchy depth, yet no train-versus-test hierarchy-consistency metric is reported. Such a diagnostic would be required to show that the self-consistent regularization improves generalization rather than memorizing training co-occurrence patterns.

    Authors: We accept that a direct diagnostic of hierarchy consistency across train and test splits is missing. In the revised version we will define and report a hierarchy-consistency metric that quantifies how well the learned inter-level relationships on the training set align with those observed on the test set. This metric will be computed for both datasets and used to argue that the self-consistent regularization promotes generalization rather than memorization of training co-occurrences. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the SAHC architectural proposal

full rationale

The paper proposes a new Semantics-Aware Hierarchical Consensus (SAHC) method that adds hierarchy-specific classification heads, trainable hierarchy matrices, and a weighted consensus mechanism as independent architectural components. These are introduced to guide learning of hierarchical structure rather than being derived from or equivalent to quantities already fitted inside the same model equations. The central claims are supported by experimental results on external benchmark datasets with no reduction of any prediction or uniqueness claim to a self-definition, fitted input renamed as output, or self-citation chain. The derivation is therefore self-contained as an empirical architectural innovation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the assumption that supplied label hierarchies are semantically meaningful and that the self-consistent regularization will not introduce harmful biases. No free parameters or invented physical entities are described.

axioms (1)
  • domain assumption Predefined label hierarchies represent meaningful semantic relationships among classes
    Stated in the abstract as frequently overlooked but used as input to the method.
invented entities (1)
  • Trainable hierarchy matrices no independent evidence
    purpose: To guide the network in learning the hierarchical structure in a self-consistent manner
    Introduced as a core component of SAHC; no independent evidence outside the method itself is provided in the abstract.

pith-pipeline@v0.9.0 · 5752 in / 1277 out tokens · 38590 ms · 2026-05-18T10:01:56.186716+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1]

    MGSeg: Multiple Granularity-Based Real-Time Semantic Segmentation Network,

    J.-Y . He, S.-H. Liang, X. Wu, B. Zhao, and L. Zhang, “MGSeg: Multiple Granularity-Based Real-Time Semantic Segmentation Network,”IEEE Trans. Image Process., vol. 30, pp. 7200–7214, Aug. 2021

  2. [2]

    Multi-Granularity Part Sampling Attention for Fine-Grained Visual Classification,

    J. Wang, Q. Xu, B. Jiang, B. Luo, and J. Tang, “Multi-Granularity Part Sampling Attention for Fine-Grained Visual Classification,”IEEE Trans. Image Process., vol. 33, pp. 4529–4542, Aug. 2024

  3. [3]

    Multi-Granularity Con- trastive Cross-Modal Collaborative Generation for End-to-End Long- Term Video Question Answering,

    T. Yu, K. Fu, J. Zhang, Q. Huang, and J. Yu, “Multi-Granularity Con- trastive Cross-Modal Collaborative Generation for End-to-End Long- Term Video Question Answering,”IEEE Trans. Image Process., vol. 33, pp. 3115–3129, Apr. 2024

  4. [4]

    Exploiting Related and Unre- lated Tasks for Hierarchical Metric Learning and Image Classification,

    Y . Zheng, J. Fan, J. Zhang, and X. Gao, “Exploiting Related and Unre- lated Tasks for Hierarchical Metric Learning and Image Classification,” IEEE Trans. Image Process., vol. 29, pp. 883–896, Sep. 2020. UNDER REVIEW FOR PUBLICATION ON IEEE TRANSACTIONS ON IMAGE PROCESSING 12

  5. [5]

    Hierarchical Learning of Tree Classifiers for Large-Scale Plant Species Identification,

    J. Fan, N. Zhou, J. Peng, and L. Gao, “Hierarchical Learning of Tree Classifiers for Large-Scale Plant Species Identification,”IEEE Trans. Image Process., vol. 24, no. 11, pp. 4172–4184, Jul. 2015

  6. [6]

    Deep hierarchical semantic segmentation,

    L. Li, T. Zhou, W. Wang, J. Li, and Y . Yang, “Deep hierarchical semantic segmentation,” inProc. IEEE Conf. Comput. Vis. Pattern Recog., New Orleans, LA, USA, Jun. 2022, pp. 1236–1247

  7. [7]

    Object based image analysis for remote sensing,

    T. Blaschke, “Object based image analysis for remote sensing,”ISPRS J. Photogramm. Remote Sens., vol. 65, no. 1, pp. 2–16, Jan. 2010

  8. [8]

    Comparison of Landscape Metrics for Three Different Level Land Cover/Land Use Maps,

    E. Sertel, R. H. Topalo ˘glu, B. S ¸allı, I. Yay Algan, and G. A. Aksu, “Comparison of Landscape Metrics for Three Different Level Land Cover/Land Use Maps,”ISPRS Int. J. Geo-Inf., vol. 7, no. 10, p. 408, Oct. 2018

  9. [9]

    Robust Training of Deep Neural Networks with Weakly Labelled Data,

    G. Perantoni and L. Bruzzone, “Robust Training of Deep Neural Networks with Weakly Labelled Data,” inSignal and Image Processing for Remote Sensing, C. H. Chen, Ed. Boca Raton, FL, USA: CRC Press, 2024, pp. 256–279

  10. [10]

    A review of regional and Global scale Land Use/Land Cover (LULC) mapping products generated from satellite remote sensing,

    Y . Wang, Y . Sun, X. Cao, Y . Wang, W. Zhang, and X. Cheng, “A review of regional and Global scale Land Use/Land Cover (LULC) mapping products generated from satellite remote sensing,”ISPRS J. Photogramm. Remote Sens., vol. 206, pp. 311–334, Dec. 2023

  11. [11]

    Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features,

    X. Lu, Y . Chen, and X. Li, “Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features,”IEEE Trans. Image Process., vol. 27, no. 1, pp. 106–120, Jan. 2018

  12. [12]

    Hierarchical Feature Fusion Network for Salient Object Detection,

    X. Li, D. Song, and Y . Dong, “Hierarchical Feature Fusion Network for Salient Object Detection,”IEEE Trans. Image Process., vol. 29, pp. 9165–9175, Sep. 2020

  13. [13]

    Do Convolutional Neural Networks Learn Class Hierarchy?

    A. Bilal, A. Jourabloo, M. Ye, X. Liu, and L. Ren, “Do Convolutional Neural Networks Learn Class Hierarchy?”IEEE Trans. Vis. Comput. Graphics, vol. 24, no. 1, pp. 152–162, Jan. 2018

  14. [14]

    Hierarchical multi-label classification using local neural networks,

    R. Cerri, R. C. Barros, and A. C. de Carvalho, “Hierarchical multi-label classification using local neural networks,”J. Comp. Sys. Sci., vol. 80, no. 1, pp. 39–56, Feb. 2014

  15. [15]

    HD- MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition,

    J. Fan, T. Zhao, Z. Kuang, Y . Zheng, J. Zhang, J. Yu, and J. Peng, “HD- MTL: Hierarchical Deep Multi-Task Learning for Large-Scale Visual Recognition,”IEEE Trans. Image Process., vol. 26, no. 4, pp. 1923– 1938, Apr. 2017

  16. [16]

    Coherent hierarchical multi-label classification networks,

    E. Giunchiglia and T. Lukasiewicz, “Coherent hierarchical multi-label classification networks,”Adv. Neural Inform. Process. Syst., vol. 33, pp. 9662–9673, Dec. 2020

  17. [17]

    Making Better Mistakes: Leveraging Class Hierarchies With Deep Networks,

    L. Bertinetto, R. Mueller, K. Tertikas, S. Samangooei, and N. A. Lord, “Making Better Mistakes: Leveraging Class Hierarchies With Deep Networks,” inProc. IEEE Conf. Comput. Vis. Pattern Recog., Seattle, W A, USA, Jun. 2020, pp. 12 506–12 515

  18. [18]

    Learning Hierarchy Aware Features for Reducing Mistake Severity,

    A. Garg, D. Sani, and S. Anand, “Learning Hierarchy Aware Features for Reducing Mistake Severity,” inProc. Eur. Conf. Comput. Vis., Tel Aviv, Israel, Oct. 2022, pp. 252–267

  19. [19]

    Large-scale object classification using label relation graphs,

    J. Deng, N. Ding, Y . Jia, A. Frome, K. Murphy, S. Bengio, Y . Li, H. Neven, and H. Adam, “Large-scale object classification using label relation graphs,” inProc. Eur. Conf. Comput. Vis., Zurich, Switzerland, Sep. 2014, pp. 48–64

  20. [20]

    Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Clas- sification,

    J. Chen, P. Wang, J. Liu, and Y . Qian, “Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Clas- sification,” inProc. IEEE Conf. Comput. Vis. Pattern Recog., New Orleans, LA, USA, Jun. 2022, pp. 4858–4867

  21. [21]

    Neural-Based Hierarchical Approach for Detailed Dominant Forest Species Classifica- tion by Multispectral Satellite Imagery,

    S. Illarionova, A. Trekin, V . Ignatiev, and I. Oseledets, “Neural-Based Hierarchical Approach for Detailed Dominant Forest Species Classifica- tion by Multispectral Satellite Imagery,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 1810–1820, Dec. 2021

  22. [22]

    Hierarchical classification of sentinel 2-a images for land use and land cover mapping and its use for the CORINE system,

    D. C ¸ . Demirkan, A. Koz, and H. S ¸. D¨uzg¨un, “Hierarchical classification of sentinel 2-a images for land use and land cover mapping and its use for the CORINE system,”J. Appl. Remote Sens., vol. 14, no. 2, p. 026524, Jun. 2020

  23. [23]

    Can a Hierarchi- cal Classification of Sentinel-2 Data Improve Land Cover Mapping?

    A. Wa ´sniewski, A. Ho ´sciło, and M. Chmielewska, “Can a Hierarchi- cal Classification of Sentinel-2 Data Improve Land Cover Mapping?” Remote Sens., vol. 14, no. 4, p. 989, Feb. 2022

  24. [24]

    Your “Flamingo

    D. Chang, K. Pang, Y . Zheng, Z. Ma, Y .-Z. Song, and J. Guo, “Your “Flamingo” is My “Bird”: Fine-Grained, or Not,” inProc. IEEE Conf. Comput. Vis. Pattern Recog., Nashville, TN, USA, Jun. 2021, pp. 11 476–11 485

  25. [25]

    Settles,Active Learning, ser

    B. Settles,Active Learning, ser. Synthesis Lectures on Artificial In- telligence and Machine Learning, R. J. Brachman, W. W.Cohen, and T. Dietterich, Eds. Cham, Switzerland: Springer, 2012

  26. [26]

    A class-driven hierar- chical ResNet for classification of multispectral remote sensing images,

    G. Weikmann, G. Perantoni, and L. Bruzzone, “A class-driven hierar- chical ResNet for classification of multispectral remote sensing images,” inProc. SPIE Image Sign. Proc. Remote Sens. XXIX, vol. 12733, Sep. 2023

  27. [27]

    Log-Linear Pool to Combine Prior Distributions: A Suggestion for a Calibration-Based Approach,

    M. J. Rufo, J. Mart ´ın, and C. J. P ´erez, “Log-Linear Pool to Combine Prior Distributions: A Suggestion for a Calibration-Based Approach,” Bayesian Anal., vol. 7, no. 2, pp. 411 – 438, Jun. 2012

  28. [28]

    Remote Sensing Image Scene Classifi- cation: Benchmark and State of the Art,

    G. Cheng, J. Han, and X. Lu, “Remote Sensing Image Scene Classifi- cation: Benchmark and State of the Art,”Proc. IEEE, vol. 105, no. 10, pp. 1865–1883, 2017

  29. [29]

    On Creating Benchmark Dataset for Aerial Image Inter- pretation: Reviews, Guidances, and Million-AID,

    Y . Long, G.-S. Xia, S. Li, W. Yang, M. Y . Yang, X. X. Zhu, L. Zhang, and D. Li, “On Creating Benchmark Dataset for Aerial Image Inter- pretation: Reviews, Guidances, and Million-AID,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 14, pp. 4205–4230, Apr. 2021

  30. [30]

    ESA-CCI High Resolution Land Cover: Generated Products,

    L. Bruzzone, A. Amodio, F. Bovolo, M. A. Brovelli, M. Corsi, P. De- fourny, P. Gamba, G. Moser, C. Ottl ´e, L. P. Mayoset al., “ESA-CCI High Resolution Land Cover: Generated Products,” inESA Living Planet Symp., Bonn, Germany, May 2022

  31. [31]

    ESA CCI High Resolution Land Cover: Methodology and EO Data Processing Chain,

    L. Bruzzone, F. Bovolo, C. Paris, L. Maggiolo, P. Gamba, G. Moser, G. Perantoni, I. Podsiadlo, D. Solarna, T. Sorriso, M. Zanetti, and K. Meshkini, “ESA CCI High Resolution Land Cover: Methodology and EO Data Processing Chain,” inESA Living Planet Symp., Bonn, Germany, May 2022

  32. [32]

    Training General Representations for Remote Sensing Using in-Domain Knowledge,

    M. Neumann, A. S. Pinto, X. Zhai, and N. Houlsby, “Training General Representations for Remote Sensing Using in-Domain Knowledge,” in Proc. IEEE Int. Geosci. Remote Sens. Symp., Waikoloa, HI, USA, Oct. 2020, pp. 6730–6733

  33. [33]

    Rethinking Vision Transformers for MobileNet Size and Speed,

    Y . Li, J. Hu, Y . Wen, G. Evangelidis, K. Salahi, Y . Wang, S. Tulyakov, and J. Ren, “Rethinking Vision Transformers for MobileNet Size and Speed,” inProc. IEEE Int. Conf. Comput. Vis., Paris, France, Oct. 2023, pp. 16 889–16 900

  34. [34]

    MobileNetV2: Inverted Residuals and Linear Bottlenecks,

    M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” inProc. IEEE Conf. Comput. Vis. Pattern Recog., Salt Lake City, UT, USA, Jun. 2018, pp. 4510–4520

  35. [35]

    LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks,

    W. Lu, S.-B. Chen, C. H. Ding, J. Tang, and B. Luo, “LWGANet: A Lightweight Group Attention Backbone for Remote Sensing Visual Tasks,”arXiv:2501.10040, 2025

  36. [36]

    Self-attention for raw optical Satellite Time Series Classification,

    M. Rußwurm and M. K ¨orner, “Self-attention for raw optical Satellite Time Series Classification,”ISPRS J. Photogramm. Remote Sens., vol. 169, pp. 421–435, Nov. 2020

  37. [37]

    RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution,

    Z. Geng, L. Liang, T. Ding, and I. Zharkov, “RSTT: Real-time Spatial Temporal Transformer for Space-Time Video Super-Resolution,” in Proc. IEEE Conf. Comput. Vis. Pattern Recog., New Orleans, LA, USA, Jun. 2022, pp. 17 441–17 451

  38. [38]

    Convolutional LSTM network: A machine learning approach for precipitation nowcasting,

    S. Xingjian, Z. Chen, H. Wang, D.-Y . Yeung, W.-K. Wong, and W.-c. Woo, “Convolutional LSTM network: A machine learning approach for precipitation nowcasting,” inAdv. Neural Inform. Process. Syst., vol. 28, Montreal, Canada, Dec. 2015, pp. 802–810