Combined Hyperbolic and Euclidean Soft Triple Loss Beyond the Single Space Deep Metric Learning
Pith reviewed 2026-05-18 09:02 UTC · model grok-4.3
The pith
Combining hyperbolic and Euclidean soft triple losses improves deep metric learning accuracy and stability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that the CHEST loss, which consists of proxy-based losses in hyperbolic and Euclidean spaces plus a regularization term based on hyperbolic hierarchical clustering, makes proxy-based methods stable and effective in hyperbolic geometry. They further claim that combining the two spaces yields higher accuracy and more stable learning than single-space methods.
What carries the argument
The CHEST loss, which merges soft triple losses from hyperbolic and Euclidean embedding spaces with a regularization term from hyperbolic hierarchical clustering.
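To make the construction concrete, the following is a minimal pure-Python sketch of a SoftTriple-style loss evaluated once with a Euclidean similarity and once with a hyperbolic one, then summed. It is an illustration under stated assumptions, not the paper's implementation: the use of negative Poincaré distance as the hyperbolic similarity, the weights `w_euc` and `w_hyp`, the function name `chest_like_loss`, and the default hyperparameters are all assumptions, and the hyperbolic hierarchical clustering regularizer is omitted.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def poincare_distance(u, v, eps=1e-5):
    """Geodesic distance in the unit Poincare ball (curvature -1)."""
    # Clamp squared norms below 1 so the denominator stays positive near the boundary.
    uu = min(dot(u, u), 1.0 - eps)
    vv = min(dot(v, v), 1.0 - eps)
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * diff / ((1.0 - uu) * (1.0 - vv))
    return math.acosh(max(arg, 1.0))

def soft_triple(x, label, centers, sim, lam=20.0, gamma=0.1, delta=0.01):
    """SoftTriple-style loss for one embedding x.

    centers[c] is a list of K center vectors for class c;
    sim(x, w) is any similarity where higher means closer.
    """
    logits = []
    for c, class_centers in enumerate(centers):
        s = [sim(x, w) for w in class_centers]
        # Softmax over the K centers of class c, shifted by the max for stability.
        m = max(s)
        weights = [math.exp((v - m) / gamma) for v in s]
        relaxed = sum(w_k * v for w_k, v in zip(weights, s)) / sum(weights)
        # The margin delta is subtracted only for the ground-truth class.
        margin = delta if c == label else 0.0
        logits.append(lam * (relaxed - margin))
    # Cross-entropy over classes, computed with log-sum-exp.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def chest_like_loss(x_euc, x_hyp, label, centers_euc, centers_hyp,
                    w_euc=1.0, w_hyp=1.0):
    """Weighted sum of the two proxy-based terms (regularizer omitted; weights assumed)."""
    loss_e = soft_triple(x_euc, label, centers_euc, sim=dot)
    # Assumption: hyperbolic similarity is the negative Poincare distance.
    loss_h = soft_triple(x_hyp, label, centers_hyp,
                         sim=lambda a, b: -poincare_distance(a, b))
    return w_euc * loss_e + w_hyp * loss_h

# Toy usage: two classes, one center per class in each space.
centers_euc = [[[1.0, 0.0]], [[0.0, 1.0]]]
centers_hyp = [[[0.6, 0.0]], [[-0.6, 0.0]]]
print(chest_like_loss([1.0, 0.0], [0.5, 0.0], 0, centers_euc, centers_hyp))
```

Passing the similarity as a function keeps the proxy-based loss identical across geometries, so any difference in behavior comes from the distance itself. The clamp in `poincare_distance` is one common guard against the blow-up near the ball boundary.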
If this is right
- The combination enhances DML accuracy and learning stability for both hyperbolic and Euclidean spaces.
- The method achieves new state-of-the-art performance on four benchmark datasets.
- Proxy-based losses, which have lower training complexity, become applicable in hyperbolic space through this hybrid approach.
Where Pith is reading between the lines
- This hybrid approach might be extended to other geometric spaces or loss functions in representation learning.
- Practitioners working with hierarchical data such as images in taxonomies could benefit from testing this on domain-specific datasets.
- Future experiments could vary the balance between the hyperbolic and Euclidean components to optimize for different data characteristics.
Load-bearing premise
The assumption that simply combining the losses from both spaces overcomes the unspecified issues with proxy-based losses in hyperbolic space alone, without introducing new optimization problems.
What would settle it
Training results on the benchmark datasets where the CHEST loss does not outperform the pure Euclidean soft triple loss or shows decreased stability would falsify the central claim.
Original abstract
Deep metric learning (DML) aims to learn a neural network mapping data to an embedding space, which can represent semantic similarity between data points. Hyperbolic space is attractive for DML since it can represent richer structures, such as tree structures. DML in hyperbolic space is based on pair-based loss or unsupervised regularization loss. On the other hand, supervised proxy-based losses in hyperbolic space have not been reported yet due to some issues in applying proxy-based losses in a hyperbolic space. However, proxy-based losses are attractive for large-scale datasets since they have less training complexity. To address these, this paper proposes the Combined Hyperbolic and Euclidean Soft Triple (CHEST) loss. CHEST loss is composed of the proxy-based losses in hyperbolic and Euclidean spaces and the regularization loss based on hyperbolic hierarchical clustering. We find that the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces. Finally, we evaluate the CHEST loss on four benchmark datasets, achieving a new state-of-the-art performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes the Combined Hyperbolic and Euclidean Soft Triple (CHEST) loss for deep metric learning. It combines proxy-based Soft Triple losses operating in hyperbolic and Euclidean spaces with an additional regularization term based on hyperbolic hierarchical clustering. The central claims are that this hybrid construction enables stable supervised proxy-based learning in hyperbolic geometry (where such losses have not previously been reported due to unspecified issues), that the combination improves both accuracy and training stability in each space individually, and that the method attains new state-of-the-art results on four standard DML benchmark datasets.
Significance. If the experimental claims are substantiated, the work would be significant because it supplies the first practical supervised proxy-based loss for hyperbolic DML. Proxy-based losses scale better than pair-based alternatives for large datasets; combining them with hyperbolic geometry could therefore improve both efficiency and the faithful embedding of hierarchical structure without sacrificing stability.
major comments (2)
- [Abstract] Abstract: the headline claim that 'the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces' is load-bearing yet unsupported by any isolation experiment. The abstract itself notes that supervised proxy-based losses 'have not been reported yet due to some issues' in hyperbolic space; the CHEST construction simply adds the Euclidean Soft Triple term and a hyperbolic regularizer. Without an ablation that trains the hyperbolic Soft Triple component alone (or with only the regularizer), it is impossible to determine whether the reported stability gain arises from genuine interaction or from the Euclidean term simply dominating the objective.
- [§4] §4 (Experimental results): the manuscript asserts new state-of-the-art performance on four benchmark datasets, yet the provided text supplies neither the full set of baselines, run-to-run variance, statistical significance tests, nor ablation tables that would be required to attribute gains specifically to the hybrid loss rather than to implementation details or hyper-parameter tuning.
minor comments (2)
- [Abstract] The abstract refers to 'some issues' with proxy-based losses in hyperbolic space without enumerating them; a brief enumeration (e.g., numerical instability of the hyperbolic distance or gradient explosion) would strengthen the motivation section.
- [§3] Notation for the combined loss (presumably Eq. (X) in §3) should explicitly separate the Euclidean Soft Triple term, the hyperbolic Soft Triple term, and the hierarchical clustering regularizer so that readers can see the weighting coefficients at a glance.
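One way the separated notation requested above could look is sketched below. The symbols are placeholders chosen for illustration, since the provided text does not give the paper's equation: the superscripts E and H mark the Euclidean and hyperbolic Soft Triple terms, HHC marks the hyperbolic hierarchical clustering regularizer, and the three weights λ are assumed coefficients.

```latex
\mathcal{L}_{\mathrm{CHEST}}
  = \lambda_{E}\,\mathcal{L}_{\mathrm{ST}}^{E}
  + \lambda_{H}\,\mathcal{L}_{\mathrm{ST}}^{H}
  + \lambda_{R}\,\mathcal{L}_{\mathrm{HHC}}
```

Written this way, the ablations discussed in the major comments correspond to setting one or more of the weights to zero.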
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. We address the two major comments point by point below and commit to revisions that will strengthen the experimental support for our claims.
- Referee: [Abstract] Abstract: the headline claim that 'the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces' is load-bearing yet unsupported by any isolation experiment. The abstract itself notes that supervised proxy-based losses 'have not been reported yet due to some issues' in hyperbolic space; the CHEST construction simply adds the Euclidean Soft Triple term and a hyperbolic regularizer. Without an ablation that trains the hyperbolic Soft Triple component alone (or with only the regularizer), it is impossible to determine whether the reported stability gain arises from genuine interaction or from the Euclidean term simply dominating the objective.
Authors: We agree that isolating the contribution of each term is important for substantiating the headline claim. As noted in the manuscript, direct application of supervised proxy-based losses in hyperbolic space has not been reported due to training instabilities that we also observed in preliminary experiments (divergence or collapse when optimizing the hyperbolic Soft Triple term alone). The CHEST construction was developed precisely to overcome these issues by leveraging the Euclidean term for stability while retaining hyperbolic geometry's structural advantages, augmented by the hierarchical clustering regularizer. In the revised manuscript we will add explicit ablation experiments that report (i) Euclidean Soft Triple alone, (ii) hyperbolic Soft Triple plus regularizer, and (iii) the full CHEST objective, together with training-curve stability metrics. These results will clarify the synergistic effect rather than simple domination by the Euclidean term. revision: yes
- Referee: [§4] §4 (Experimental results): the manuscript asserts new state-of-the-art performance on four benchmark datasets, yet the provided text supplies neither the full set of baselines, run-to-run variance, statistical significance tests, nor ablation tables that would be required to attribute gains specifically to the hybrid loss rather than to implementation details or hyper-parameter tuning.
Authors: We acknowledge that the current experimental section would benefit from more rigorous reporting to support the SOTA claims. The revised manuscript will expand §4 to include the complete set of baselines with citations, results averaged over multiple random seeds with standard deviations, statistical significance tests (e.g., paired t-tests or Wilcoxon tests against the strongest baseline), and detailed ablation tables that isolate each component of CHEST. These additions will allow readers to attribute performance gains specifically to the hybrid loss rather than implementation or tuning artifacts. revision: yes
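The multi-seed reporting promised in the response above can be sketched with the Python standard library alone. This is an illustrative helper under stated assumptions, not the authors' evaluation code: the function names are hypothetical, and the scores in the usage lines are made-up placeholders rather than results from the paper.

```python
import math
import statistics

def summarize(scores):
    """Mean and sample standard deviation of a metric over random seeds."""
    return statistics.mean(scores), statistics.stdev(scores)

def paired_t_statistic(a, b):
    """t statistic for paired samples a[i], b[i] from the same seed i.

    Compare the result against a t distribution with len(a) - 1 degrees
    of freedom to obtain a p-value.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    if sd_d == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean_d / (sd_d / math.sqrt(len(diffs)))

# Placeholder scores for illustration only (not taken from the paper).
method_scores = [2.0, 4.0, 6.0]
baseline_scores = [1.0, 2.0, 3.0]
print(summarize(method_scores))
print(paired_t_statistic(method_scores, baseline_scores))
```

Pairing by seed removes seed-to-seed variation that both methods share, which is why a paired test is usually preferred over comparing two independent means when the same seeds are reused.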
Circularity Check
No significant circularity; empirical proposal with independent evaluation
The paper introduces the CHEST loss as a novel combination of proxy-based terms in hyperbolic and Euclidean spaces plus a hyperbolic hierarchical clustering regularizer. The abstract and provided text contain no equations, derivations, or self-citations that reduce any claimed prediction or stability improvement to a fitted parameter, self-definition, or prior author result by construction. The central findings rest on experimental evaluation across four benchmark datasets rather than a closed mathematical chain that loops back to its inputs. This is the expected honest outcome for an applied loss-function paper whose value is demonstrated through external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Hyperbolic space can represent richer structures such as tree structures for DML.
- domain assumption: Proxy-based losses have not been reported in hyperbolic space due to some issues.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: CHEST loss is composed of the proxy-based losses in hyperbolic and Euclidean spaces and the regularization loss based on hyperbolic hierarchical clustering.
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: We find that the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.