Person Re-Identification via Generalized Class Prototypes
Pith reviewed 2026-05-18 05:36 UTC · model grok-4.3
The pith
Selecting multiple non-centroid representations per class improves re-identification accuracy and mean average precision over standard centroid methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Prior centroid-based techniques for representing gallery classes during retrieval yield suboptimal re-identification metrics. A generalized selection method that chooses multiple representations per class, not restricted to centroids, consistently improves accuracy and mean average precision when applied on top of multiple existing embeddings. The number of representations per class can be adjusted to meet specific requirements.
What carries the argument
Generalized class prototypes: the representations selected per class for use at retrieval time, which may be several per class and need not be the class centroid.
If this is right
- The number of representations kept per class can be tuned to trade accuracy against mean average precision according to deployment needs.
- The same selection method delivers gains when layered on top of several different re-identification embeddings.
- Improvements appear in both accuracy and mean average precision, moving results beyond those reported by contemporary methods.
- No retraining of the underlying feature extractor is required.
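To make the bullets above concrete, here is a minimal sketch of retrieval against a gallery that stores one or more prototypes per class, on top of fixed embeddings. Everything in it is an assumption for illustration: the toy 2-D vectors, the function names, and the rule of scoring a class by its nearest prototype. The summary does not state the paper's actual selection or scoring rule.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def rank_classes(query, prototypes):
    """Rank class ids by distance from the query to the nearest prototype.

    prototypes: dict mapping class id -> list of one or more vectors.
    """
    scores = {cid: min(dist(query, p) for p in protos)
              for cid, protos in prototypes.items()}
    return sorted(scores, key=scores.get)

# Hypothetical toy embeddings: class "a" has two well-separated modes.
gallery = {
    "a": [[0.0, 0.0], [0.0, 0.2], [4.0, 0.0], [4.0, 0.2]],
    "b": [[2.5, 0.0], [2.5, 0.2]],
}

# Baseline: a single centroid per class.
centroids = {cid: [centroid(vs)] for cid, vs in gallery.items()}
# Other extreme: keep every embedding as a prototype (largest possible k).
all_members = {cid: list(vs) for cid, vs in gallery.items()}

query = [3.9, 0.1]  # lies next to the second mode of class "a"
by_centroid = rank_classes(query, centroids)
by_members = rank_classes(query, all_members)
```

In this toy case the centroid of class "a" falls between its two modes, so the centroid gallery ranks "b" first while the multi-prototype gallery ranks "a" first. This only illustrates why a single centroid can misrepresent a multimodal class; it is not evidence for the paper's reported gains.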
Where Pith is reading between the lines
- If storage is a constraint, practitioners could choose fewer than the full set of representations while still outperforming a single centroid.
- The approach might extend to other image retrieval tasks where class prototypes are currently computed as centroids.
- A natural next test would be whether the gains hold when the gallery set grows much larger than the training classes.
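The storage point in the first bullet comes down to simple arithmetic, sketched below. The numbers (1,000 identities, 2,048-dimensional float32 embeddings) and the function name are hypothetical, chosen only to show how gallery size scales with the number of prototypes k per class.

```python
def gallery_bytes(num_classes, k, dim, bytes_per_value=4):
    """Bytes needed to store k prototypes of `dim` values for each class."""
    return num_classes * k * dim * bytes_per_value

# Hypothetical deployment: 1,000 identities, 2,048-dim float32 embeddings.
for k in (1, 3, 10):
    mib = gallery_bytes(1000, k, 2048) / 2**20
    print(f"k={k}: {mib:.1f} MiB")
```

Under brute-force search, the number of query-to-prototype distance computations grows by the same factor k, so storage and latency move together unless an index structure is used.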
Load-bearing premise
That prior centroid-based techniques are suboptimal in re-identification metrics and that the proposed generalized selection will produce consistent gains without hidden costs such as increased storage or inference time.
What would settle it
A controlled experiment on a standard benchmark dataset such as Market-1501 that shows no gain or a drop in rank-1 accuracy and mAP when the generalized selection method replaces centroid prototypes on the same embeddings.
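For reference, the two metrics such an experiment would report can be computed as in the sketch below. It is a simplified version under stated assumptions: the input is an already-ranked list of gallery labels per query, the function name is hypothetical, and benchmark-specific filtering (for example, ignoring gallery images taken by the query's own camera) is omitted.

```python
def rank1_and_map(ranked_labels_per_query, true_labels):
    """Compute rank-1 accuracy and mean average precision (mAP).

    ranked_labels_per_query: for each query, gallery labels sorted from
        most to least similar.
    true_labels: the correct identity for each query.
    """
    hits, ap_sum = 0, 0.0
    for ranked, truth in zip(ranked_labels_per_query, true_labels):
        hits += ranked[0] == truth  # rank-1: is the top result correct?
        found, precisions = 0, []
        for position, label in enumerate(ranked, start=1):
            if label == truth:
                found += 1
                precisions.append(found / position)  # precision at each hit
        ap_sum += sum(precisions) / len(precisions) if precisions else 0.0
    n = len(true_labels)
    return hits / n, ap_sum / n

# Two hypothetical queries with ranked gallery labels.
ranked = [["a", "b", "a"], ["a", "b", "b"]]
truth = ["a", "b"]
rank1, mean_ap = rank1_and_map(ranked, truth)
```

Running the same function on rankings produced from centroid prototypes and from generalized prototypes, with identical embeddings, gives the paired comparison described above.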
Original abstract
Advanced feature extraction methods have significantly contributed to enhancing the task of person re-identification. In addition, modifications to objective functions have been developed to further improve performance. Nonetheless, selecting better class representatives is an underexplored area of research that can also lead to advancements in re-identification performance. Although past works have experimented with using the centroid of a gallery image class during training, only a few have investigated alternative representations during the retrieval stage. In this paper, we demonstrate that these prior techniques yield suboptimal results in terms of re-identification metrics. To address the re-identification problem, we propose a generalized selection method that involves choosing representations that are not limited to class centroids. Our approach strikes a balance between accuracy and mean average precision, leading to improvements beyond the state of the art. For example, the actual number of representations per class can be adjusted to meet specific application requirements. We apply our methodology on top of multiple re-identification embeddings, and in all cases it substantially improves upon contemporary results.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a generalized class prototype selection method for person re-identification that moves beyond class centroids during retrieval. It asserts that prior centroid-based techniques produce suboptimal accuracy and mAP, and claims that the new adjustable selection of non-centroid representations per class yields improvements over the state of the art when applied atop multiple existing embeddings, while allowing the number of prototypes to be tuned for specific application needs.
Significance. If the empirical claims are substantiated, the work could provide a lightweight, post-hoc way to boost re-id performance by optimizing gallery representations without retraining feature extractors or losses. The adjustability of prototype count per class offers practical flexibility. However, the current text supplies no quantitative results, ablations, or cost analysis, so significance cannot yet be evaluated.
major comments (3)
- [Abstract] The assertion that 'these prior techniques yield suboptimal results in terms of re-identification metrics' is stated without any supporting numbers, tables, or citations to concrete accuracy/mAP values from the referenced centroid methods.
- [Method] The generalized selection method is described only at a high level; no concrete criterion (clustering, nearest-neighbor sampling, learned selection, etc.), algorithm, or pseudocode is given for choosing the non-centroid representations.
- [Experiments] No ablation or comparison is reported that isolates the gain from k>1 prototypes versus the k=1 centroid baseline while measuring the resulting increase in gallery size, storage, or retrieval latency.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the paper.
- Referee: [Abstract] The assertion that 'these prior techniques yield suboptimal results in terms of re-identification metrics' is stated without any supporting numbers, tables, or citations to concrete accuracy/mAP values from the referenced centroid methods.
  Authors: We agree that the abstract would benefit from greater specificity. In the revised manuscript, we will add concrete quantitative examples drawn from our experimental results (e.g., specific accuracy and mAP deltas versus centroid baselines) and include citations to the prior centroid-based works with their reported metrics for direct comparison.
  Revision: yes
- Referee: [Method] The generalized selection method is described only at a high level; no concrete criterion (clustering, nearest-neighbor sampling, learned selection, etc.), algorithm, or pseudocode is given for choosing the non-centroid representations.
  Authors: The full manuscript describes the generalized prototype selection as a per-class process that selects multiple representations based on intra-class similarity and diversity criteria. To make this fully explicit, we will insert a formal algorithm description together with pseudocode in the Method section of the revision.
  Revision: yes
- Referee: [Experiments] No ablation or comparison is reported that isolates the gain from k>1 prototypes versus the k=1 centroid baseline while measuring the resulting increase in gallery size, storage, or retrieval latency.
  Authors: This is a fair observation. While our main results demonstrate overall gains when using multiple prototypes, we did not present a dedicated cost-benefit ablation. We will add a new table and accompanying analysis in the Experiments section that isolates performance improvements for varying numbers of prototypes against the corresponding increases in gallery size, storage, and query latency.
  Revision: yes
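The rebuttal mentions selection driven by intra-class similarity and diversity but gives no algorithm. One generic way to pick diverse representatives from a class is greedy farthest-point sampling, sketched below. This is a swapped-in technique chosen for illustration, not the authors' procedure; the function names, the choice of starting point, and the toy data are all assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_point_prototypes(vectors, k):
    """Greedily pick up to k diverse vectors from one class.

    Starts from the vector closest to the class mean, then repeatedly adds
    the unchosen vector that is farthest from its nearest chosen vector.
    """
    n = len(vectors)
    mean = [sum(col) / n for col in zip(*vectors)]
    first = min(range(n), key=lambda i: dist(vectors[i], mean))
    chosen = [first]
    # Distance from every vector to its nearest chosen prototype so far.
    nearest = [dist(v, vectors[first]) for v in vectors]
    while len(chosen) < min(k, n):
        candidates = [i for i in range(n) if i not in chosen]
        nxt = max(candidates, key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(d, dist(v, vectors[nxt]))
                   for d, v in zip(nearest, vectors)]
    return [vectors[i] for i in chosen]

# Hypothetical class with three visual modes along one axis.
class_vectors = [[0.0, 0.0], [0.1, 0.0], [2.0, 0.0], [2.1, 0.0], [5.0, 0.0]]
prototypes = farthest_point_prototypes(class_vectors, k=3)
```

With k=1 this reduces to the single member nearest the class mean, so sweeping k from 1 upward gives the kind of prototype-count ablation the referee asks for, with gallery size growing in proportion to k.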
Circularity Check
No derivation chain present; purely empirical method
The paper describes an empirical proposal for generalized class prototype selection in person re-identification, claiming improvements over centroid baselines via experimental results on embeddings. No equations, first-principles derivations, or load-bearing mathematical steps are referenced in the provided abstract or summary. The central claims rest on benchmark performance gains rather than any reduction of outputs to fitted inputs or self-citations, rendering the work self-contained with no circularity.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: We propose a generalized selection method that involves choosing representations that are not limited to class centroids... attention-based model... L = L_triplet + λ L_reg
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: alpha_pin_under_high_calibration
  Relation between the paper passage and the cited Recognition theorem: unclear
  Passage: The number of prototypes per class is treated as a tunable hyperparameter... N=3 yields best R-1/mAP trade-off
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.