MARC: Multi-Label Adaptive Retrieval Contrastive Loss for Remote Sensing Images
Pith reviewed 2026-05-16 21:50 UTC · model grok-4.3
The pith
MACL extends contrastive learning with label-aware sampling, frequency weighting, and dynamic temperatures to balance multi-label remote sensing retrieval across rare and common categories.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MACL is introduced as an extension of contrastive learning that integrates label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to achieve balanced representation learning across both common and rare categories, thereby mitigating semantic imbalance and yielding more reliable retrieval performance on multi-label remote-sensing datasets.
What carries the argument
The MACL loss augments the standard contrastive loss in three ways: label-aware sampling of positives and negatives, frequency-sensitive weighting that upweights rare labels, and dynamic-temperature scaling that varies the temperature parameter based on label frequency.
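To make the three components concrete, here is a minimal pure-Python sketch of a loss in this style. It is not the authors' implementation. The weight and temperature expressions follow the loss fragment quoted on this page (w_ip, T_ip), but several pieces are assumptions made only for illustration: positives are taken to be pairs sharing at least one label, `f` is taken to be the mean training count of the shared labels, `h` the mean training count of the anchor's labels, `J` the Jaccard similarity of the two label sets, and the denominator sums over all other samples in the batch. The names `macl_style_loss`, `mean_count`, `alpha`, `beta`, and `eps` are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors (cosine if both are unit norm)."""
    return sum(x * y for x, y in zip(a, b))

def jaccard(a, b):
    """Jaccard similarity between two label sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mean_count(label_set, label_counts):
    """Mean training-set count of the labels in label_set (assumed frequency proxy)."""
    return sum(label_counts[l] for l in label_set) / len(label_set)

def macl_style_loss(embeddings, labels, label_counts, alpha=1.0, beta=0.1, eps=1e-6):
    """Toy frequency-weighted, temperature-adaptive contrastive loss.

    embeddings   : list of L2-normalised vectors (lists of floats)
    labels       : list of sets of label ids, one per embedding
    label_counts : dict mapping label id -> count (>= 1) in the training set
    """
    n = len(embeddings)
    total, n_pairs = 0.0, 0
    for i in range(n):
        if not labels[i]:
            continue  # an unlabeled anchor has no positives
        h = mean_count(labels[i], label_counts)  # assumed form of h
        for p in range(n):
            shared = labels[i] & labels[p]
            if p == i or not shared:
                continue  # label-aware sampling: positives share >= 1 label (assumption)
            f = mean_count(shared, label_counts)  # assumed form of f
            # Frequency-sensitive weight: rarer shared labels give a larger weight.
            w_ip = 1.0 / math.log(1.0 + f) + eps
            # Dynamic temperature: label overlap term plus anchor-frequency term.
            t_ip = math.exp(-alpha * jaccard(labels[i], labels[p])) + beta / math.log(1.0 + h)
            logits = [dot(embeddings[i], embeddings[k]) / t_ip for k in range(n) if k != i]
            peak = max(logits)  # log-sum-exp shift for numerical stability
            log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
            log_prob = dot(embeddings[i], embeddings[p]) / t_ip - log_denom
            total -= w_ip * log_prob
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0
```

In this sketch a pair that shares only a rare label contributes more to the loss than a pair that shares a common one, which is the balancing effect the summary attributes to MACL. Whether the paper normalises by the number of pairs, or places the epsilon inside the denominator, cannot be determined from the text available here.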
If this is right
- MACL reduces the performance gap between frequent and rare land-cover categories in retrieval tasks.
- The method produces more reliable rankings from large-scale multi-label archives.
- It enables contrastive frameworks to exploit multi-label annotations without being dominated by common classes.
- The approach supports deployment in operational remote-sensing retrieval systems where label imbalance is typical.
Where Pith is reading between the lines
- The same sampling-weighting-scaling pattern could be tested on multi-label medical imaging or document retrieval tasks that share similar imbalance problems.
- Releasing the code makes it possible to measure how much each component contributes on datasets with different label co-occurrence statistics.
- Dynamic temperature scaling might transfer to other contrastive or metric-learning losses outside remote sensing.
Load-bearing premise
The three added components deliver generalizable improvements on remote-sensing datasets beyond the three benchmarks without needing dataset-specific retuning.
What would settle it
Evaluating the full MACL pipeline on a new remote-sensing dataset never used in the original experiments and checking whether retrieval metrics remain superior to the same contrastive baselines.
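As a rough illustration of what such a check involves, the sketch below computes two common retrieval metrics, average precision and recall@K, for a single query. It assumes a retrieved image counts as relevant when it shares at least one label with the query. That is one common convention in multi-label retrieval, not necessarily the one used in the paper, and the function names are hypothetical.

```python
def is_relevant(query_labels, item_labels):
    """Assumed relevance rule: at least one label in common."""
    return bool(query_labels & item_labels)

def average_precision(query_labels, ranked_labels):
    """Average precision of one ranked list (label sets in retrieval order)."""
    hits, precision_sum = 0, 0.0
    for rank, item_labels in enumerate(ranked_labels, start=1):
        if is_relevant(query_labels, item_labels):
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0

def recall_at_k(query_labels, ranked_labels, k):
    """Fraction of all relevant items that appear in the top k results."""
    relevant_total = sum(is_relevant(query_labels, l) for l in ranked_labels)
    if relevant_total == 0:
        return 0.0
    relevant_top = sum(is_relevant(query_labels, l) for l in ranked_labels[:k])
    return relevant_top / relevant_total

def mean_average_precision(queries):
    """Mean of average_precision over (query_labels, ranked_labels) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(q, r) for q, r in queries) / len(queries)
```

Running the same functions over rankings from MACL and from each contrastive baseline on the unseen dataset, with hyperparameters frozen beforehand, would give the comparison described above.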
Original abstract
Semantic overlap among land-cover categories, highly imbalanced label distributions, and complex inter-class co-occurrence patterns constitute significant challenges for multi-label remote-sensing image retrieval. In this article, Multi-Label Adaptive Contrastive Learning (MACL) is introduced as an extension of contrastive learning to address them. It integrates label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to achieve balanced representation learning across both common and rare categories. Extensive experiments on three benchmark datasets (DLRSD, ML-AID, and WHDLD) show that MACL consistently outperforms contrastive-loss-based baselines, effectively mitigating semantic imbalance and delivering more reliable retrieval performance in large-scale remote-sensing archives. Code, pretrained models, and evaluation scripts will be released at https://github.com/Amna-128/MARC upon acceptance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Multi-Label Adaptive Contrastive Learning (MACL) as an extension of standard contrastive loss for multi-label remote-sensing image retrieval. It combines label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling to mitigate semantic overlap, label imbalance, and co-occurrence issues, claiming consistent outperformance over contrastive-loss baselines on the DLRSD, ML-AID, and WHDLD datasets.
Significance. If the reported gains hold under broader testing, the work addresses a practically relevant problem in large-scale RS archives where rare land-cover classes and complex label co-occurrences degrade retrieval reliability. Releasing code, pretrained models, and scripts would support reproducibility and adoption.
major comments (2)
- [Experiments] Experiments section: All quantitative results are confined to the three chosen benchmarks (DLRSD, ML-AID, WHDLD). No cross-dataset transfer experiments, no evaluation on an external remote-sensing corpus, and no fixed-hyperparameter tests on held-out collections are reported. This leaves open whether the three proposed components produce generalizable improvements or merely exploit the specific label-frequency and co-occurrence statistics of these datasets.
- [Section 3 and Experiments] Section 3 (Method) and Experiments: The central claim that the three components together deliver balanced representation learning requires ablation results that isolate the contribution of each (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling) with statistical significance tests. Without these, it is difficult to confirm that the observed retrieval gains are attributable to the adaptive mechanism rather than to dataset-specific tuning.
minor comments (2)
- [Abstract] Abstract: The claim of outperformance should be accompanied by at least one quantitative metric (e.g., mAP or recall@K) even in the abstract to allow immediate assessment of effect size.
- [Throughout] Notation: Ensure consistent expansion of the acronym MACL versus MARC throughout the text and figures.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight important aspects of generalizability and component validation. We will revise the manuscript to incorporate additional experiments addressing both points, thereby strengthening the claims regarding the effectiveness of MARC.
- Referee: [Experiments] Experiments section: All quantitative results are confined to the three chosen benchmarks (DLRSD, ML-AID, WHDLD). No cross-dataset transfer experiments, no evaluation on an external remote-sensing corpus, and no fixed-hyperparameter tests on held-out collections are reported. This leaves open whether the three proposed components produce generalizable improvements or merely exploit the specific label-frequency and co-occurrence statistics of these datasets.
  Authors: We acknowledge the limitation that all reported results are on the three chosen benchmarks. These datasets are standard in the multi-label remote-sensing retrieval literature and were selected for their complementary challenges in label imbalance and co-occurrence. To demonstrate broader applicability, the revised manuscript will include cross-dataset transfer experiments (training on one benchmark and evaluating on the others) as well as results with fixed hyperparameters on held-out collections. We will also add a brief discussion of benchmark representativeness in the Experiments section.
  Revision: yes
- Referee: [Section 3 and Experiments] Section 3 (Method) and Experiments: The central claim that the three components together deliver balanced representation learning requires ablation results that isolate the contribution of each (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling) with statistical significance tests. Without these, it is difficult to confirm that the observed retrieval gains are attributable to the adaptive mechanism rather than to dataset-specific tuning.
  Authors: We agree that isolating each component's contribution is necessary. The current manuscript presents overall performance comparisons but does not include component-wise ablations. In the revised version we will add a dedicated ablation study in the Experiments section that systematically removes or disables label-aware sampling, frequency-sensitive weighting, and dynamic-temperature scaling one at a time. We will also report statistical significance tests (paired t-tests or Wilcoxon signed-rank tests) across multiple random seeds on the primary retrieval metrics to confirm that the observed gains are attributable to the proposed adaptive mechanisms rather than tuning.
  Revision: yes
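The significance testing promised in the second response can be sketched as follows. This is an illustrative, stdlib-only version of an exact two-sided Wilcoxon signed-rank test on per-seed metric pairs (for example, mAP of the full model versus mAP with one component disabled, using the same seeds). The function names and the example numbers in the usage note are hypothetical, and a maintained statistics library would be the more usual choice in practice.

```python
from itertools import product

def average_ranks(values):
    """Rank values in ascending order (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def wilcoxon_signed_rank_exact(full, ablated):
    """Return (W_plus, two-sided exact p-value) for paired per-seed scores.

    Zero differences are dropped. The p-value enumerates all 2**n sign
    assignments, so this is only practical for a small number of seeds.
    """
    diffs = [a - b for a, b in zip(full, ablated) if a != b]
    if not diffs:
        return 0.0, 1.0
    ranks = average_ranks([abs(d) for d in diffs])
    half_total = sum(ranks) / 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - half_total)
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - half_total) >= observed - 1e-12:
            extreme += 1  # sign pattern at least as extreme as the observed one
    return w_plus, extreme / 2 ** len(ranks)
```

With five seeds and the full model ahead on every seed, for instance `wilcoxon_signed_rank_exact([0.80, 0.82, 0.81, 0.83, 0.79], [0.75, 0.78, 0.77, 0.80, 0.76])`, the smallest attainable two-sided p-value is 2/32 = 0.0625. An ablation study that aims for the conventional 0.05 threshold with this test therefore needs at least six seeds.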
Circularity Check
No circularity in MACL derivation chain
The paper defines MACL explicitly as an extension of standard contrastive loss by adding three independent components (label-aware sampling, frequency-sensitive weighting, dynamic-temperature scaling). No equations reduce any claimed improvement to a fitted quantity defined by the same data, no self-citations serve as load-bearing uniqueness theorems, and no ansatz or renaming collapses the derivation by construction. The central claim rests on empirical outperformance on three benchmarks rather than on any self-referential reduction.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: MACL loss ... $w_{ip} = 1 / \log(1 + f(y(i), y(p))) + \epsilon$ ... $T_{ip} = \exp(-\alpha J(y(i), y(p))) + \beta \, (1 / \log(1 + h(y(i))))$ ... $L_{\mathrm{MACL}} = \sum$ ... $w_{ip} \log \exp(s_{ip} / T_{ip}) /$ ...
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Extensive experiments on three benchmark datasets (DLRSD, ML-AID, and WHDLD)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.