arxiv: 2512.16294 · v2 · submitted 2025-12-18 · 💻 cs.CV

MARC: Multi-Label Adaptive Retrieval Contrastive Loss for Remote Sensing Images

Pith reviewed 2026-05-16 21:50 UTC · model grok-4.3

classification 💻 cs.CV
keywords multi-label retrieval · contrastive learning · remote sensing · image retrieval · label imbalance · adaptive sampling · representation learning

The pith

MACL extends contrastive learning with label-aware sampling, frequency weighting, and dynamic temperatures to balance multi-label remote sensing retrieval across rare and common categories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Multi-Label Adaptive Contrastive Learning (MACL) to tackle semantic overlap, highly imbalanced label distributions, and complex co-occurrences in multi-label remote-sensing image retrieval. It modifies standard contrastive learning by adding label-aware sampling to select relevant pairs, frequency-sensitive weighting to emphasize rare labels, and dynamic-temperature scaling to adjust loss sharpness according to label prevalence. These changes aim to produce more equitable representations that do not neglect infrequent land-cover classes. Experiments across DLRSD, ML-AID, and WHDLD benchmarks show MACL outperforming prior contrastive baselines in retrieval metrics. The result matters because large remote-sensing archives contain many rare but important categories whose poor retrieval limits practical Earth observation analysis.

Core claim

MACL is introduced as an extension of contrastive learning that integrates label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to achieve balanced representation learning across both common and rare categories, thereby mitigating semantic imbalance and yielding more reliable retrieval performance on multi-label remote-sensing datasets.

What carries the argument

The MACL loss carries the argument. It augments the standard contrastive loss with three mechanisms: label-aware sampling of positives and negatives, frequency-sensitive weighting that upweights rare labels, and dynamic-temperature scaling that varies the temperature parameter with label frequency.
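To make the three mechanisms concrete, here is a minimal sketch of a loss built from the same ingredients. It is not the paper's formulation, which this page does not reproduce. The function name `macl_style_loss`, the inverse-frequency weight, the parameter `tau_gain`, and the choice to give rarer labels a lower (sharper) temperature are all assumptions made for illustration.

```python
import math
from collections import Counter


def dot(u, v):
    """Dot product of two equal-length vectors (cosine similarity if both are unit-norm)."""
    return sum(a * b for a, b in zip(u, v))


def macl_style_loss(embeddings, labels, base_tau=0.1, tau_gain=1.0):
    """Illustrative multi-label contrastive loss (hypothetical, not the paper's exact loss).

    embeddings: list of L2-normalised vectors, one per sample in the batch.
    labels:     list of sets of label ids, one set per sample.
    """
    n = len(embeddings)
    counts = Counter(c for sample_labels in labels for c in sample_labels)
    if not counts:
        return 0.0

    # Batch-level label frequency in (0, 1].
    freq = {c: k / n for c, k in counts.items()}

    # Frequency-sensitive weighting (assumed form): inverse frequency,
    # rescaled so the average weight over labels is 1.
    inv = {c: 1.0 / f for c, f in freq.items()}
    mean_inv = sum(inv.values()) / len(inv)
    weight = {c: v / mean_inv for c, v in inv.items()}

    total, terms = 0.0, 0
    for i in range(n):
        others = [a for a in range(n) if a != i]
        sims = {a: dot(embeddings[i], embeddings[a]) for a in others}
        for c in labels[i]:
            # Label-aware sampling: positives are the other samples carrying label c.
            positives = [p for p in others if c in labels[p]]
            if not positives:
                continue
            # Dynamic temperature (assumed direction): rarer label -> smaller tau.
            tau = base_tau * (1.0 + tau_gain * freq[c])
            log_denom = math.log(sum(math.exp(sims[a] / tau) for a in others))
            nll = -sum(sims[p] / tau - log_denom for p in positives) / len(positives)
            total += weight[c] * nll
            terms += 1
    return total / terms if terms else 0.0
```

In this sketch each (anchor, label) pair contributes its own term, so a sample with several labels is pulled toward a different positive set for each of them. Swapping the weight or temperature rule changes only two lines, which is convenient when probing how much each component matters.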

If this is right

  • MACL reduces the performance gap between frequent and rare land-cover categories in retrieval tasks.
  • The method produces more reliable rankings from large-scale multi-label archives.
  • It enables contrastive frameworks to exploit multi-label annotations without being dominated by common classes.
  • The approach supports deployment in operational remote-sensing retrieval systems where label imbalance is typical.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same sampling-weighting-scaling pattern could be tested on multi-label medical imaging or document retrieval tasks that share similar imbalance problems.
  • Releasing the code makes it possible to measure how much each component contributes on datasets with different label co-occurrence statistics.
  • Dynamic temperature scaling might transfer to other contrastive or metric-learning losses outside remote sensing.

Load-bearing premise

The three added components deliver generalizable improvements on remote-sensing datasets beyond the three benchmarks without needing dataset-specific retuning.

What would settle it

Evaluating the full MACL pipeline on a new remote-sensing dataset never used in the original experiments and checking whether retrieval metrics remain superior to the same contrastive baselines.
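A check of this kind comes down to ranking a gallery for each query and scoring the ranking. The sketch below computes mean average precision (mAP) under one common multi-label convention: a gallery image counts as relevant if it shares at least one label with the query. That relevance rule and the function names are assumptions for illustration; the paper's exact evaluation protocol is not given on this page.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def average_precision(ranked_relevance):
    """AP for one query, given relevance flags in ranked order (best match first)."""
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(query_emb, query_labels, gallery_emb, gallery_labels):
    """mAP over all queries; relevance = at least one shared label (assumed rule)."""
    aps = []
    for q, q_labels in zip(query_emb, query_labels):
        # Rank gallery items by similarity to the query, highest first.
        order = sorted(range(len(gallery_emb)),
                       key=lambda j: dot(q, gallery_emb[j]),
                       reverse=True)
        relevance = [bool(q_labels & gallery_labels[j]) for j in order]
        aps.append(average_precision(relevance))
    return sum(aps) / len(aps) if aps else 0.0
```

Running the same function on embeddings from MACL and from each baseline, on a dataset none of them was tuned on, gives the comparison described above.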

Figures

Figures reproduced from arXiv: 2512.16294 by Amna Amir, Erchan Aptoula.

Figure 1: Illustration of the MulSupCon method. Each row represents a sample’s one-hot label vector, with the first …
Figure 2: Training and retrieval pipeline employing the proposed MACL loss. During training, images are encoded by a …
Figure 3: Visualization of the MACL mechanism. For each anchor label, MACL computes a label-specific gradient …
Figure 4: Examples of data augmentations applied to the DLRSD dataset.
Figure 7: Top-6 retrieved images for the given query on the DLRSD dataset, comparing the best-performing methods …
Figure 8: Loss Function Comparison Across Metrics (All Datasets). The evaluated loss functions are color-coded …
Figure 9: Ablation study on the role of T_ip (parameterized by α and β) and w_ip in the proposed MACL loss. Subfigures (a) and (b) show mAP performance as 3D surfaces (with colorbars indicating mAP in %), while (c) and (d) show 2D mAP curves for varying α and β, respectively: (a) mAP Performance vs α and β (T_ip only), (b) mAP Performance vs w_ip and T_ip (α + β), (c) mAP vs w_ip and α, and (d) mAP vs w_ip and β. When eith…
Figure 10: Ablation study of the temperature parameter …
Figure 11: Effect of learning rate on MACL and Weighted MACL across datasets. Smaller learning rates generally …

Original abstract

Semantic overlap among land-cover categories, highly imbalanced label distributions, and complex inter-class co-occurrence patterns constitute significant challenges for multi-label remote-sensing image retrieval. In this article, Multi-Label Adaptive Contrastive Learning (MACL) is introduced as an extension of contrastive learning to address them. It integrates label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to achieve balanced representation learning across both common and rare categories. Extensive experiments on three benchmark datasets (DLRSD, ML-AID, and WHDLD) show that MACL consistently outperforms contrastive-loss-based baselines, effectively mitigating semantic imbalance and delivering more reliable retrieval performance in large-scale remote-sensing archives. Code, pretrained models, and evaluation scripts will be released at https://github.com/Amna-128/MARC upon acceptance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Multi-Label Adaptive Contrastive Learning (MACL) as an extension of standard contrastive loss for multi-label remote-sensing image retrieval. It combines label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to mitigate semantic overlap, label imbalance, and co-occurrence issues, claiming consistent outperformance over contrastive-loss baselines on the DLRSD, ML-AID, and WHDLD datasets.

Significance. If the reported gains hold under broader testing, the work addresses a practically relevant problem in large-scale RS archives where rare land-cover classes and complex label co-occurrences degrade retrieval reliability. Releasing code, pretrained models, and scripts would support reproducibility and adoption.

major comments (2)
  1. [Experiments] Experiments section: All quantitative results are confined to the three chosen benchmarks (DLRSD, ML-AID, WHDLD). No cross-dataset transfer experiments, no evaluation on an external remote-sensing corpus, and no fixed-hyperparameter tests on held-out collections are reported. This leaves open whether the three proposed components produce generalizable improvements or merely exploit the specific label-frequency and co-occurrence statistics of these datasets.
  2. [Section 3 and Experiments] Section 3 (Method) and Experiments: The central claim that the three components together deliver balanced representation learning requires ablation results that isolate the contribution of each (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling) with statistical significance tests. Without these, it is difficult to confirm that the observed retrieval gains are attributable to the adaptive mechanism rather than to dataset-specific tuning.
minor comments (2)
  1. [Abstract] Abstract: The claim of outperformance should be accompanied by at least one quantitative metric (e.g., mAP or recall@K) even in the abstract to allow immediate assessment of effect size.
  2. [Throughout] Notation: Ensure consistent expansion of the acronym MACL versus MARC throughout the text and figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of generalizability and component validation. We will revise the manuscript to incorporate additional experiments addressing both points, thereby strengthening the claims regarding the effectiveness of MARC.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: All quantitative results are confined to the three chosen benchmarks (DLRSD, ML-AID, WHDLD). No cross-dataset transfer experiments, no evaluation on an external remote-sensing corpus, and no fixed-hyperparameter tests on held-out collections are reported. This leaves open whether the three proposed components produce generalizable improvements or merely exploit the specific label-frequency and co-occurrence statistics of these datasets.

    Authors: We acknowledge the limitation that all reported results are on the three chosen benchmarks. These datasets are standard in the multi-label remote-sensing retrieval literature and were selected for their complementary challenges in label imbalance and co-occurrence. To demonstrate broader applicability, the revised manuscript will include cross-dataset transfer experiments (training on one benchmark and evaluating on the others) as well as results with fixed hyperparameters on held-out collections. We will also add a brief discussion of benchmark representativeness in the Experiments section.

    Revision: yes

  2. Referee: [Section 3 and Experiments] Section 3 (Method) and Experiments: The central claim that the three components together deliver balanced representation learning requires ablation results that isolate the contribution of each (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling) with statistical significance tests. Without these, it is difficult to confirm that the observed retrieval gains are attributable to the adaptive mechanism rather than to dataset-specific tuning.

    Authors: We agree that isolating each component's contribution is necessary. The current manuscript presents overall performance comparisons but does not include component-wise ablations. In the revised version we will add a dedicated ablation study in the Experiments section that systematically removes or disables label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling one at a time. We will also report statistical significance tests (paired t-tests or Wilcoxon signed-rank tests) across multiple random seeds on the primary retrieval metrics to confirm that the observed gains are attributable to the proposed adaptive mechanisms rather than tuning.

    Revision: yes
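The seed-paired comparison promised in the second response can be illustrated with a small, dependency-free sketch. It uses an exact paired sign-flip permutation test, swapped in here for the paired t-test or Wilcoxon signed-rank test the response names, because it needs only the standard library. The function name and the scores in the usage lines are hypothetical placeholders, not results from the paper.

```python
from itertools import product


def paired_sign_flip_test(scores_a, scores_b):
    """Two-sided exact sign-flip permutation test on paired per-seed scores.

    Returns the fraction of sign assignments whose absolute summed difference
    is at least as large as the observed one (the p-value).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme, total = 0, 0
    # Under the null hypothesis each paired difference is equally likely
    # to have either sign, so enumerate all 2**n sign patterns.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total


# Hypothetical mAP values over five seeds (placeholders, NOT from the paper).
full_model = [0.71, 0.72, 0.70, 0.73, 0.72]
ablated = [0.69, 0.70, 0.69, 0.70, 0.70]
p_value = paired_sign_flip_test(full_model, ablated)
```

One design consequence is worth noting: with n paired seeds the smallest two-sided p-value this test can return is 2 / 2**n, so five seeds cannot go below 0.0625 even when the full model wins on every seed. An ablation that aims for the usual 0.05 threshold needs at least six seeds per configuration.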

Circularity Check

0 steps flagged

No circularity in MACL derivation chain

full rationale

The paper defines MACL explicitly as an extension of standard contrastive loss by adding three independent components (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling). No equations reduce any claimed improvement to a fitted quantity defined by the same data, no self-citations serve as load-bearing uniqueness theorems, and no ansatz or renaming collapses the derivation by construction. The central claim rests on empirical outperformance on three benchmarks rather than on any self-referential reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the work rests on standard contrastive-learning assumptions plus the three named adaptive mechanisms whose precise formulations are not detailed here.

pith-pipeline@v0.9.0 · 5431 in / 999 out tokens · 31613 ms · 2026-05-16T21:50:07.410173+00:00 · methodology
