
arxiv: 2510.05643 · v2 · submitted 2025-10-07 · 💻 cs.CV

Combined Hyperbolic and Euclidean Soft Triple Loss Beyond the Single Space Deep Metric Learning

Pith reviewed 2026-05-18 09:02 UTC · model grok-4.3

classification 💻 cs.CV
keywords deep metric learning · hyperbolic embeddings · Euclidean embeddings · soft triple loss · proxy-based loss · hierarchical clustering · embedding space

The pith

Combining hyperbolic and Euclidean soft triple losses improves deep metric learning accuracy and stability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the CHEST loss, which combines supervised proxy-based losses in hyperbolic and Euclidean spaces for deep metric learning. Hyperbolic space captures hierarchical structure better than Euclidean space, but proxy-based losses, which train efficiently on large datasets, have been difficult to apply there. The paper reports that pairing a hyperbolic proxy-based loss with a Euclidean one, and adding a regularizer based on hyperbolic hierarchical clustering, improves accuracy and training stability in both spaces. This matters because it makes hyperbolic embedding techniques practical for large datasets while yielding better representations of semantic similarity.
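To make the geometric intuition concrete, here is a minimal sketch of the geodesic distance in the unit Poincaré ball, one common model of hyperbolic space. The choice of the Poincaré ball with curvature -1, the function names, and the sample points are assumptions for illustration; this summary does not say which hyperbolic model or curvature the paper uses.

```python
import math

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit Poincare ball."""
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # eps guards against division by a vanishing denominator near the boundary
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)

def euclidean_distance(u, v):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two hypothetical pairs with the same Euclidean gap of 0.1:
# one pair near the origin, one pair near the boundary of the ball.
near_origin = ([0.0, 0.0], [0.1, 0.0])
near_boundary = ([0.85, 0.0], [0.95, 0.0])

d_origin = poincare_distance(*near_origin)
d_boundary = poincare_distance(*near_boundary)
```

The same Euclidean gap corresponds to a much larger hyperbolic distance near the boundary. That growth of volume toward the boundary is what lets tree-like structures embed with low distortion, and it is also one plausible source of the numerical sensitivity that the referee report speculates about.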

Core claim

The authors claim that the CHEST loss, which consists of proxy-based losses in hyperbolic and Euclidean spaces plus a hyperbolic hierarchical clustering regularizer, makes proxy-based methods stable and effective in hyperbolic geometry. They report higher accuracy and more stable learning than single-space methods.

What carries the argument

The CHEST loss, which merges soft triple losses from hyperbolic and Euclidean embedding spaces with a regularization term from hyperbolic hierarchical clustering.
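Schematically, a three-part objective of this kind can be written as a weighted sum. The symbols and the weights $\alpha$ and $\beta$ below are assumptions for illustration only; this summary does not give the paper's notation or say how the terms are weighted.

```latex
\mathcal{L}_{\mathrm{CHEST}}
  = \mathcal{L}_{\mathrm{ST}}^{\mathrm{Euc}}
  + \alpha \, \mathcal{L}_{\mathrm{ST}}^{\mathrm{Hyp}}
  + \beta \, \mathcal{L}_{\mathrm{HHC}}
```

Here $\mathcal{L}_{\mathrm{ST}}^{\mathrm{Euc}}$ and $\mathcal{L}_{\mathrm{ST}}^{\mathrm{Hyp}}$ stand for the soft triple losses computed in the Euclidean and hyperbolic embedding spaces, and $\mathcal{L}_{\mathrm{HHC}}$ for the hyperbolic hierarchical clustering regularizer.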

If this is right

  • The combination enhances DML accuracy and learning stability for both hyperbolic and Euclidean spaces.
  • The method achieves new state-of-the-art performance on four benchmark datasets.
  • Proxy-based losses, which have lower training complexity, become applicable in hyperbolic space through this hybrid approach.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This hybrid approach might be extended to other geometric spaces or loss functions in representation learning.
  • Practitioners working with hierarchical data such as images in taxonomies could benefit from testing this on domain-specific datasets.
  • Future experiments could vary the balance between the hyperbolic and Euclidean components to optimize for different data characteristics.

Load-bearing premise

The assumption that simply combining the losses from both spaces overcomes the unspecified issues that proxy-based losses face in hyperbolic space alone, without creating new optimization problems.

What would settle it

Results on the benchmark datasets in which the CHEST loss fails to outperform the pure Euclidean soft triple loss, or trains less stably, would falsify the central claim.

Figures

Figures reproduced from arXiv: 2510.05643 by Hirohisa Aman, Minoru Kawahara, Shozo Saeki.

Figure 1: Network architecture for the CHEST loss.
Figure 2: Distribution of embeddings on CUB200 train and test datasets and examples of retrieval results on the …
Figure 3: The histogram of similarities between batch data and classes. The left histogram result is without Euclidean …
Original abstract

Deep metric learning (DML) aims to learn a neural network mapping data to an embedding space, which can represent semantic similarity between data points. Hyperbolic space is attractive for DML since it can represent richer structures, such as tree structures. DML in hyperbolic space is based on pair-based loss or unsupervised regularization loss. On the other hand, supervised proxy-based losses in hyperbolic space have not been reported yet due to some issues in applying proxy-based losses in a hyperbolic space. However, proxy-based losses are attractive for large-scale datasets since they have less training complexity. To address these, this paper proposes the Combined Hyperbolic and Euclidean Soft Triple (CHEST) loss. CHEST loss is composed of the proxy-based losses in hyperbolic and Euclidean spaces and the regularization loss based on hyperbolic hierarchical clustering. We find that the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces. Finally, we evaluate the CHEST loss on four benchmark datasets, achieving a new state-of-the-art performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the Combined Hyperbolic and Euclidean Soft Triple (CHEST) loss for deep metric learning. It combines proxy-based Soft Triple losses operating in hyperbolic and Euclidean spaces with an additional regularization term based on hyperbolic hierarchical clustering. The central claims are that this hybrid construction enables stable supervised proxy-based learning in hyperbolic geometry (where such losses have not previously been reported due to unspecified issues), that the combination improves both accuracy and training stability in each space individually, and that the method attains new state-of-the-art results on four standard DML benchmark datasets.
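For readers unfamiliar with the Soft Triple building block, the following is a minimal single-sample sketch of the Euclidean (cosine-similarity) SoftTriple loss as formulated by Qian et al. (ICCV 2019). It is not the authors' implementation, and the hyperbolic variant is deliberately omitted because this summary does not specify it. The function names, the toy centers, and the default values of `lam`, `gamma`, and `delta` are assumptions for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relaxed_similarity(x, centers, gamma):
    """Softmax-weighted average of the similarities between x and one class's centers."""
    sims = [dot(x, w) for w in centers]
    m = max(sims)  # subtract the max for numerical stability
    weights = [math.exp((s - m) / gamma) for s in sims]
    z = sum(weights)
    return sum(w / z * s for w, s in zip(weights, sims))

def soft_triple_loss(x, label, class_centers, lam=20.0, gamma=0.1, delta=0.01):
    """Cross-entropy over relaxed class similarities, with margin delta on the true class."""
    x = normalize(x)
    sims = [
        relaxed_similarity(x, [normalize(w) for w in centers], gamma)
        for centers in class_centers
    ]
    logits = [lam * (s - delta) if c == label else lam * s for c, s in enumerate(sims)]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

# Hypothetical toy setup: two classes, two learnable centers (proxies) per class.
class_centers = [
    [[1.0, 0.0], [0.9, 0.1]],  # class 0
    [[0.0, 1.0], [0.1, 0.9]],  # class 1
]
sample = [1.0, 0.05]
loss_correct = soft_triple_loss(sample, 0, class_centers)
loss_wrong = soft_triple_loss(sample, 1, class_centers)
```

Because each sample is compared against a fixed set of class centers rather than against other samples, the cost per batch grows with the number of classes and centers instead of with the number of pairs or triplets, which is the scaling advantage the report refers to.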

Significance. If the experimental claims are substantiated, the work would be significant because it supplies the first practical supervised proxy-based loss for hyperbolic DML. Proxy-based losses scale better than pair-based alternatives for large datasets; combining them with hyperbolic geometry could therefore improve both efficiency and the faithful embedding of hierarchical structure without sacrificing stability.

major comments (2)
  1. [Abstract] The headline claim that 'the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces' is load-bearing yet unsupported by any isolation experiment. The abstract itself notes that supervised proxy-based losses 'have not been reported yet due to some issues' in hyperbolic space; the CHEST construction simply adds the Euclidean Soft Triple term and a hyperbolic regularizer. Without an ablation that trains the hyperbolic Soft Triple component alone (or with only the regularizer), it is impossible to determine whether the reported stability gain arises from genuine interaction or from the Euclidean term simply dominating the objective.
  2. [§4] (Experimental results) The manuscript asserts new state-of-the-art performance on four benchmark datasets, yet the provided text supplies no full set of baselines, run-to-run variance, statistical significance tests, or ablation tables, all of which would be required to attribute gains specifically to the hybrid loss rather than to implementation details or hyper-parameter tuning.
minor comments (2)
  1. [Abstract] The abstract refers to 'some issues' with proxy-based losses in hyperbolic space without enumerating them; a brief enumeration (e.g., numerical instability of the hyperbolic distance or gradient explosion) would strengthen the motivation section.
  2. [§3] Notation for the combined loss (presumably Eq. (X) in §3) should explicitly separate the Euclidean Soft Triple term, the hyperbolic Soft Triple term, and the hierarchical clustering regularizer so that readers can see the weighting coefficients at a glance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We address the two major comments point by point below and commit to revisions that will strengthen the experimental support for our claims.

  1. Referee: [Abstract] The headline claim that 'the combination of hyperbolic and Euclidean spaces improves DML accuracy and learning stability for both spaces' is load-bearing yet unsupported by any isolation experiment. The abstract itself notes that supervised proxy-based losses 'have not been reported yet due to some issues' in hyperbolic space; the CHEST construction simply adds the Euclidean Soft Triple term and a hyperbolic regularizer. Without an ablation that trains the hyperbolic Soft Triple component alone (or with only the regularizer), it is impossible to determine whether the reported stability gain arises from genuine interaction or from the Euclidean term simply dominating the objective.

    Authors: We agree that isolating the contribution of each term is important for substantiating the headline claim. As noted in the manuscript, direct application of supervised proxy-based losses in hyperbolic space has not been reported due to training instabilities that we also observed in preliminary experiments (divergence or collapse when optimizing the hyperbolic Soft Triple term alone). The CHEST construction was developed precisely to overcome these issues by leveraging the Euclidean term for stability while retaining hyperbolic geometry's structural advantages, augmented by the hierarchical clustering regularizer. In the revised manuscript we will add explicit ablation experiments that report (i) Euclidean Soft Triple alone, (ii) hyperbolic Soft Triple plus regularizer, and (iii) the full CHEST objective, together with training-curve stability metrics. These results will clarify the synergistic effect rather than simple domination by the Euclidean term. revision: yes

  2. Referee: [§4] (Experimental results) The manuscript asserts new state-of-the-art performance on four benchmark datasets, yet the provided text supplies no full set of baselines, run-to-run variance, statistical significance tests, or ablation tables, all of which would be required to attribute gains specifically to the hybrid loss rather than to implementation details or hyper-parameter tuning.

    Authors: We acknowledge that the current experimental section would benefit from more rigorous reporting to support the SOTA claims. The revised manuscript will expand §4 to include the complete set of baselines with citations, results averaged over multiple random seeds with standard deviations, statistical significance tests (e.g., paired t-tests or Wilcoxon tests against the strongest baseline), and detailed ablation tables that isolate each component of CHEST. These additions will allow readers to attribute performance gains specifically to the hybrid loss rather than implementation or tuning artifacts. revision: yes
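As a rough illustration of the kind of seed-level reporting promised above, the sketch below summarizes paired runs and applies an exact two-sided sign-flip permutation test. This test is a stand-in chosen because it needs only the standard library; it is not the paired t-test or Wilcoxon test named in the rebuttal. The function name and all numbers are hypothetical placeholders, not results from the paper.

```python
from itertools import product
from statistics import mean, stdev

def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip permutation test on paired differences."""
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(mean(diffs))
    count = 0
    total = 0
    # Under the null hypothesis each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = [s * d for s, d in zip(signs, diffs)]
        if abs(mean(flipped)) >= observed - 1e-12:
            count += 1
    return count / total

# Placeholder Recall@1 values for five seeds (made up for illustration only).
scores_method = [70.1, 70.4, 69.8, 70.6, 70.2]
scores_baseline = [69.5, 69.9, 69.6, 69.8, 69.7]

summary = {
    "method_mean": mean(scores_method),
    "method_std": stdev(scores_method),
    "baseline_mean": mean(scores_baseline),
    "baseline_std": stdev(scores_baseline),
    "p_value": paired_sign_flip_pvalue(scores_method, scores_baseline),
}
```

With only five seeds, the smallest two-sided p-value this exact test can produce is 2/32, so a handful of runs cannot establish significance at the 0.05 level by this method; more seeds would be needed to make the comparison informative.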

Circularity Check

0 steps flagged

No significant circularity; empirical proposal with independent evaluation


The paper introduces the CHEST loss as a novel combination of proxy-based terms in hyperbolic and Euclidean spaces plus a hyperbolic hierarchical clustering regularizer. The abstract and provided text contain no equations, derivations, or self-citations that reduce any claimed prediction or stability improvement to a fitted parameter, self-definition, or prior author result by construction. The central findings rest on experimental evaluation across four benchmark datasets rather than a closed mathematical chain that loops back to its inputs. This is the expected honest outcome for an applied loss-function paper whose value is demonstrated through external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that hyperbolic geometry is suitable for hierarchical data and that a simple combination of spaces overcomes prior difficulties with proxy losses in hyperbolic space. No free parameters or invented entities are described in the abstract.

axioms (2)
  • domain assumption: Hyperbolic space can represent richer structures such as tree structures for DML.
    Stated directly in the abstract as the reason hyperbolic space is attractive.
  • domain assumption: Proxy-based losses have not been reported in hyperbolic space due to some issues.
    Abstract presents this as the motivation for the new combined approach.

pith-pipeline@v0.9.0 · 5712 in / 1342 out tokens · 30666 ms · 2026-05-18T09:02:57.706442+00:00 · methodology
