Mitigating the Hubness Problem for Zero-Shot Learning of 3D Objects
Pith reviewed 2026-05-24 21:38 UTC · model grok-4.3
The pith
A dedicated loss reduces the hubness problem in zero-shot learning for 3D point cloud objects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The hubness problem, where a model is biased to predict only a few particular labels for most of the test instances, is even more severe for 3D recognition than for 2D recognition. One reason is that 2D methods can use networks pre-trained on large datasets like ImageNet, which produce high-quality features. In the 3D case no such large-scale labelled datasets are available for pre-training, so the extracted 3D features are of poorer quality, which in turn exacerbates the hubness problem. The authors therefore propose a loss that specifically addresses it. Their method is effective for both Zero-Shot and Generalized Zero-Shot Learning, and a
What carries the argument
A loss term introduced to specifically address and mitigate the hubness problem in the prediction of 3D object labels.
If this is right
- The proposed loss improves performance in zero-shot learning of 3D objects.
- It also improves generalized zero-shot learning where both seen and unseen classes are tested.
- New state-of-the-art results are achieved on ModelNet40, ModelNet10, McGill and SHREC2015.
- Hubness bias is reduced without needing to change the feature extractor.
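As a hedged illustration of why the generalized setting is evaluated separately: GZSL results are commonly summarized by the harmonic mean of seen-class and unseen-class accuracy, which stays low when a model does well on seen classes but collapses on unseen ones. Whether the paper reports exactly this summary is an assumption here; the function name and the input numbers below are made-up placeholders, not results from the paper.

```python
def harmonic_mean(acc_seen: float, acc_unseen: float) -> float:
    """Harmonic mean of seen-class and unseen-class accuracy (both in [0, 1])."""
    total = acc_seen + acc_unseen
    if total == 0.0:
        # Both accuracies are zero; define the summary as zero.
        return 0.0
    return 2.0 * acc_seen * acc_unseen / total


# Placeholder accuracies, chosen only to show the behaviour of the metric.
print(harmonic_mean(0.8, 0.2))  # strong on seen, weak on unseen
print(harmonic_mean(0.5, 0.5))  # balanced
```

A model that predicts a few hub labels for most unseen instances would tend to score poorly on the unseen term, so this summary penalizes it even if seen-class accuracy is high.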
Where Pith is reading between the lines
- Future 3D feature extractors trained on larger data might reduce the need for such a corrective loss.
- The approach could extend to other 3D tasks like segmentation if hubness appears there.
- Noisy real-world 3D scans would test whether the loss remains effective.
Load-bearing premise
The hubness problem stems primarily from poorer 3D features due to lack of large pre-training datasets and can be sufficiently corrected by adding one loss term without changing the feature extractor or introducing new biases.
What would settle it
If experiments show that the proposed loss does not reduce the hubness metric (such as the skewness of the label prediction distribution) or fails to increase accuracy on unseen classes in the tested 3D datasets, the central claim would be falsified.
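The skewness check mentioned above can be sketched as follows. This is a minimal illustration, assuming predictions are integer class indices and that hubness is measured as the skewness of per-class prediction counts; the function name and the toy inputs are hypothetical, and the paper may compute its hubness metric differently.

```python
import numpy as np


def prediction_skewness(predicted_labels, num_classes):
    """Skewness of the per-class prediction counts.

    A value near zero means predictions are spread evenly over classes;
    a large positive value means a few "hub" classes absorb most predictions.
    """
    counts = np.bincount(np.asarray(predicted_labels), minlength=num_classes)
    counts = counts.astype(float)
    std = counts.std()
    if std == 0.0:
        # Every class was predicted equally often: no skew at all.
        return 0.0
    return float(np.mean((counts - counts.mean()) ** 3) / std ** 3)


# Toy inputs (not data from the paper): 20 test instances, 4 classes.
balanced = [0, 1, 2, 3] * 5          # each class predicted 5 times
hubby = [0] * 17 + [1, 2, 3]         # class 0 acts as a hub

print(prediction_skewness(balanced, 4))
print(prediction_skewness(hubby, 4))
```

Comparing this value with and without the proposed loss, on the same test split, is one way to run the check described above.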
Original abstract
The development of advanced 3D sensors has enabled many objects to be captured in the wild at a large scale, and a 3D object recognition system may therefore encounter many objects for which the system has received no training. Zero-Shot Learning (ZSL) approaches can assist such systems in recognizing previously unseen objects. Applying ZSL to 3D point cloud objects is an emerging topic in the area of 3D vision, however, a significant problem that ZSL often suffers from is the so-called hubness problem, which is when a model is biased to predict only a few particular labels for most of the test instances. We observe that this hubness problem is even more severe for 3D recognition than for 2D recognition. One reason for this is that in 2D one can use pre-trained networks trained on large datasets like ImageNet, which produces high-quality features. However, in the 3D case there are no such large-scale, labelled datasets available for pre-training which means that the extracted 3D features are of poorer quality which, in turn, exacerbates the hubness problem. In this paper, we therefore propose a loss to specifically address the hubness problem. Our proposed method is effective for both Zero-Shot and Generalized Zero-Shot Learning, and we perform extensive evaluations on the challenging datasets ModelNet40, ModelNet10, McGill and SHREC2015. A new state-of-the-art result for both zero-shot tasks in the 3D case is established.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the hubness problem is more severe in 3D zero-shot learning (ZSL) and generalized ZSL than in 2D due to poorer feature quality from the absence of large-scale pre-training datasets like ImageNet; it proposes a dedicated loss term to mitigate hubness, shows effectiveness on both ZSL and GZSL protocols, and reports new state-of-the-art results on ModelNet40, ModelNet10, McGill, and SHREC2015.
Significance. If the reported results hold, the contribution is a lightweight, empirically effective correction for hubness in 3D ZSL that does not require changing the feature extractor. The manuscript's internal consistency (standard splits, hubness quantification via established metrics, and separate ZSL/GZSL evaluations) supports the central claim without circular derivations or unstated assumptions that undermine the evaluation.
minor comments (4)
- [§3] §3 (method): the proposed loss is described at a high level; adding the explicit formula (including any weighting hyperparameter) would improve reproducibility.
- [Tables 1-2] Table 1 and Table 2: report the number of runs or standard deviations for the accuracy and hubness metrics to allow readers to assess stability of the SOTA claims.
- [§5.3] §5.3 (ablation): the text states the loss addresses hubness specifically, but an additional row showing performance when the loss is replaced by a generic regularizer would strengthen that attribution.
- [Figure 3] Figure 3: the t-SNE visualizations are useful but the caption should explicitly state the perplexity and whether the same random seed was used across methods.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and the recommendation for minor revision. We are pleased that the evaluation protocols, hubness quantification, and separation of ZSL/GZSL results were viewed as internally consistent and supportive of the central claims.
Circularity Check
No circularity; empirical loss proposed and evaluated on public benchmarks
full rationale
The paper presents an empirical method consisting of a dedicated loss term to address hubness in 3D ZSL/GZSL, built on top of existing feature extractors and evaluated using standard protocols on public datasets (ModelNet40, ModelNet10, McGill, SHREC2015). No equations, derivations, or parameter-fitting steps are described that reduce by construction to the paper's own inputs, predictions, or self-citations. The central claim remains independent of any self-definitional or fitted-input circularity, with hubness quantified via standard external metrics and results reported as SOTA under conventional splits.