Recognition: 2 theorem links
Head Similarity: Modeling Structured Whole-Head Appearance Beyond Face Recognition
Pith reviewed 2026-05-11 02:51 UTC · model grok-4.3
The pith
Head Similarity extends face recognition to model structured whole-head appearance variations including hairstyle and styling changes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Head Similarity is a formulation that extends identity-centric recognition to structured whole-head similarity modeling. It explicitly captures intra-identity appearance variation and enforces a hierarchical similarity ordering across identity and appearance states, and its feasibility is demonstrated by a framework that combines hierarchical supervision with identity-aware distillation on a video-derived benchmark.
What carries the argument
The Head Similarity formulation, which explicitly captures intra-identity appearance variation and enforces hierarchical similarity ordering across identity and appearance states.
If this is right
- Meaningful similarity comparisons remain possible even under occlusion or rear-view conditions where facial cues are absent.
- Conventional face recognition models are shown to fail at capturing appearance-dependent similarity.
- Applications requiring identity consistency beyond strict biometric recognition can use whole-head cues.
- A large-scale benchmark from long-form videos enables training for diverse poses and temporal changes.
Where Pith is reading between the lines
- Such models could improve person re-identification in videos with frequent appearance changes.
- Embedding spaces might need to represent multiple appearance states per identity rather than single points.
- Future work could test generalization to real-world surveillance footage without video-based weak labels.
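The second bullet above, multiple appearance states per identity rather than a single point, can be made concrete with a small sketch. Everything below (the gallery layout, state names, and the max-over-states identity score) is a hypothetical illustration, not a description of the paper's actual embedding design.

```python
import math

def cosine(u, v):
    # plain cosine similarity between two embedding vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# One identity maps to several appearance-state embeddings
# instead of collapsing to a single representation.
gallery = {
    "id_0": {
        "long_hair": [0.9, 0.1, 0.0],
        "short_hair": [0.7, 0.6, 0.2],
    },
}

def identity_similarity(query, states):
    # identity-level score: best match over that identity's appearance states
    return max(cosine(query, e) for e in states.values())

def state_similarity(query, states, state):
    # appearance-level score: match against one specific state
    return cosine(query, states[state])

q = [0.88, 0.15, 0.05]
sid = identity_similarity(q, gallery["id_0"])
s_long = state_similarity(q, gallery["id_0"], "long_hair")
# the identity-level score upper-bounds every state-level score
assert sid >= s_long
```

Under this layout, identity comparison and appearance comparison become two different queries against the same structure, which is one way an embedding space could support the paper's hierarchical ordering.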
Load-bearing premise
A large-scale benchmark from long-form videos with weakly-supervised appearance states sufficiently captures diverse poses, occlusions, and temporal changes to train effective models.
What would settle it
A standard face recognition model trained on the same benchmark achieving accuracy comparable to the proposed Head Similarity framework on tasks that measure appearance-dependent similarity and hierarchical ordering.
Original abstract
Many vision applications require identity consistency beyond strict biometric recognition, especially under non-frontal views or when facial cues are missing. However, conventional face recognition models enforce intra-identity invariance, collapsing appearance variations such as hairstyle or styling changes into a single representation, limiting their use in appearance-sensitive scenarios. To address this limitation, we introduce Head Similarity, a new formulation that extends identity-centric recognition to structured whole-head similarity modeling. Our approach explicitly captures intra-identity appearance variation and enforces hierarchical similarity ordering across identity and appearance states, enabling meaningful comparison even under occlusion or rear-view conditions. We construct a large-scale benchmark from long-form videos with weakly-supervised appearance states, covering diverse poses, occlusions, and temporal changes. As a first step, we develop a simple yet effective framework that jointly models identity discrimination and appearance-sensitive similarity through hierarchical supervision and identity-aware distillation. Experiments show that conventional face recognition models fail to capture appearance-dependent similarity, while our approach demonstrates the feasibility of structured whole-head similarity modeling.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Head Similarity as a new formulation extending identity-centric face recognition to structured whole-head similarity modeling that explicitly captures intra-identity appearance variations (e.g., hairstyle, styling, occlusion). It constructs a large-scale benchmark from long-form videos using weakly-supervised appearance state labels and proposes a simple framework combining hierarchical supervision with identity-aware distillation. Experiments are presented to show that conventional face recognition collapses appearance variation while the proposed approach demonstrates feasibility of appearance-dependent similarity under diverse poses and views.
Significance. If the central claims hold after addressing validation gaps, the work could meaningfully advance computer vision applications needing nuanced identity consistency beyond biometrics, such as video-based re-identification or non-frontal analysis. The benchmark and hierarchical supervision idea provide a concrete starting point for future research on appearance-sensitive modeling. Credit is due for framing the problem clearly and releasing a new data resource, though the significance is tempered by the absence of label-quality diagnostics that would allow readers to trust the reported gaps versus baselines.
major comments (2)
- §3 (Benchmark Construction): The weakly-supervised appearance state labels extracted from long-form videos are load-bearing for the hierarchical similarity ordering and all downstream claims, yet the section provides no quantitative validation (e.g., label accuracy vs. manual annotation, inter-state consistency under pose variation, or noise-robustness checks). Without such evidence, it remains possible that performance differences versus face-recognition baselines arise from label artifacts rather than the modeling approach.
- §5 (Experiments): The claim that conventional face recognition models fail to capture appearance-dependent similarity while the proposed method succeeds is central, but the reported results lack concrete metrics, error bars, statistical significance tests, or ablation isolating the contribution of hierarchical supervision versus identity-aware distillation. This makes it difficult to evaluate whether the feasibility demonstration is robust.
minor comments (2)
- Abstract: The high-level description of the benchmark and framework is clear, but adding one sentence on dataset scale (number of identities, videos, and appearance states) would help readers gauge its coverage of pose/occlusion diversity.
- Method (notation): The distinction between the identity discrimination loss and the appearance-sensitive similarity loss could be clarified with a short equation or diagram in the method section to avoid ambiguity for readers unfamiliar with distillation setups.
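For concreteness, the kind of equation the notation comment asks for might look like the following. The decomposition and the weighting term are illustrative guesses, not taken from the paper:

```latex
\mathcal{L} \;=\; \mathcal{L}_{\mathrm{id}} \;+\; \lambda\,\mathcal{L}_{\mathrm{sim}},
\qquad
\mathcal{L}_{\mathrm{sim}} \text{ enforcing } \;
s_\theta(x_i, x_j) > s_\theta(x_i, x_k) > s_\theta(x_i, x_\ell)
```

Here $\mathcal{L}_{\mathrm{id}}$ would be the identity discrimination term, $\mathcal{L}_{\mathrm{sim}}$ the appearance-sensitive similarity term, and $\lambda$ a balancing weight; the inequality is the hierarchical ordering the paper states over the relation levels.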
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback, which has helped us identify areas to strengthen the manuscript. We address each major comment below and commit to revisions that directly respond to the concerns raised.
read point-by-point responses
- Referee, on §3 (Benchmark Construction): The weakly-supervised appearance state labels extracted from long-form videos are load-bearing for the hierarchical similarity ordering and all downstream claims, yet the section provides no quantitative validation (e.g., label accuracy vs. manual annotation, inter-state consistency under pose variation, or noise-robustness checks). Without such evidence, it remains possible that performance differences versus face-recognition baselines arise from label artifacts rather than the modeling approach.
Authors: We acknowledge that explicit validation of the weakly-supervised labels is essential for establishing trust in the benchmark. The labels are generated via a temporal consistency and clustering pipeline applied to long-form video tracks, but the current manuscript does not include quantitative diagnostics. In the revised version, we will add a new subsection under §3 that reports: (i) agreement metrics (accuracy, Cohen’s kappa) on a manually annotated subset of 1,000 randomly sampled tracks stratified by pose and occlusion; (ii) inter-state consistency analysis by computing intra- and inter-state similarity distributions under frontal vs. non-frontal views; and (iii) a noise-robustness check by injecting controlled label flips and re-running key experiments. These additions will allow readers to assess whether performance gaps reflect modeling improvements rather than label artifacts. revision: yes
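The agreement metrics the authors promise in (i), accuracy and Cohen's kappa between weak labels and manual annotations, can be sketched in a few lines. The label values below are fabricated for illustration; only the metrics themselves are standard.

```python
from collections import Counter

def accuracy(weak, manual):
    # fraction of tracks where the weak label matches the manual one
    return sum(w == m for w, m in zip(weak, manual)) / len(weak)

def cohens_kappa(weak, manual):
    # agreement corrected for chance, given the two annotators' marginals
    n = len(weak)
    po = accuracy(weak, manual)                    # observed agreement
    cw, cm = Counter(weak), Counter(manual)        # marginal label counts
    pe = sum(cw[k] * cm[k] for k in cw) / (n * n)  # expected chance agreement
    return (po - pe) / (1 - pe)

# hypothetical appearance-state labels on a manually annotated subset
weak   = ["long", "long", "short", "hat", "short", "long"]
manual = ["long", "short", "short", "hat", "short", "long"]

acc = accuracy(weak, manual)       # observed agreement, 5/6 here
kappa = cohens_kappa(weak, manual) # chance-corrected agreement
```

Reporting both numbers per pose/occlusion stratum, as the rebuttal proposes, would let readers see whether label quality degrades exactly where the benchmark's claims are strongest.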
- Referee, on §5 (Experiments): The claim that conventional face recognition models fail to capture appearance-dependent similarity while the proposed method succeeds is central, but the reported results lack concrete metrics, error bars, statistical significance tests, or ablation isolating the contribution of hierarchical supervision versus identity-aware distillation. This makes it difficult to evaluate whether the feasibility demonstration is robust.
Authors: We agree that the experimental presentation requires greater rigor to support the central claims. In the revision we will: (1) report all similarity metrics with error bars computed over five independent training runs using different random seeds; (2) include paired statistical significance tests (e.g., t-tests with p-values) comparing our method against each baseline; (3) add a dedicated ablation table that isolates hierarchical supervision (by removing the appearance-state ordering loss) and identity-aware distillation (by removing the distillation term) while keeping all other components fixed; and (4) expand the metric suite to include mean average precision and rank-1 accuracy in addition to the current similarity scores. These changes will make the feasibility demonstration more robust and reproducible. revision: yes
Circularity Check
No circularity: new formulation and framework with no derivations or self-referential reductions
full rationale
The paper introduces Head Similarity as a new formulation extending face recognition to structured whole-head modeling, constructs a benchmark from long-form videos using weakly-supervised appearance states, and proposes a framework with hierarchical supervision plus identity-aware distillation. No equations, parameter fittings, predictions, or derivations are present in the abstract or described approach. No self-citations are used to justify uniqueness theorems, ansatzes, or load-bearing premises. The central claim is a feasibility demonstration via experiments comparing to conventional models, which remains independent of any input reduction or self-definition. This qualifies as a self-contained new-task proposal with no circular steps.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- `IndisputableMonolith/Cost/FunctionalEquation.lean` · `washburn_uniqueness_aczel` · unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage: "Head Similarity requires the hierarchical ordering $s_\theta(x_i, x_j) > s_\theta(x_i, x_k) > s_\theta(x_i, x_\ell)$ for $(i,j) \in \mathcal{R}_1$, $(i,k) \in \mathcal{R}_2$, $(i,\ell) \in \mathcal{R}_3$"
- `IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean` · `absolute_floor_iff_bare_distinguishability` · unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Paper passage: "We adopt a dual-CLS Vision Transformer... $\mathcal{L}_{\mathrm{sim}} = \mathrm{Softplus}(m_1 + s_{an_1} - s_{ap}) + \dots$"
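The quoted passage shows only the first term of the similarity loss, with the rest elided. A minimal sketch of that term, plus one assumed second term of the same shape for the next hierarchy level, might look as follows; the margins, the pairing scheme, and the number of levels are guesses, not the paper's actual definition.

```python
import math

def softplus(x):
    # smooth hinge: log(1 + exp(x)), always positive
    return math.log1p(math.exp(x))

def hierarchical_margin_loss(s_ap, s_an1, s_an2, m1=0.1, m2=0.2):
    # s_ap:  anchor-positive similarity (same identity, same appearance state)
    # s_an1: level-1 negative (same identity, different appearance state) - assumed
    # s_an2: level-2 negative (different identity) - assumed
    # Each term pushes s_ap above the negative score by a level-specific margin,
    # mirroring the quoted Softplus(m1 + s_an1 - s_ap) + ... pattern.
    loss = softplus(m1 + s_an1 - s_ap)
    loss += softplus(m2 + s_an2 - s_ap)
    return loss
```

Because softplus is monotone, minimizing this loss drives $s_{ap} > s_{an_1} > s_{an_2}$ by growing margins, which is exactly the hierarchical ordering the first ledger entry quotes.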
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Xiang An, Xuhan Zhu, Yuan Gao, Yang Xiao, Yongle Zhao, Ziyong Feng, Lan Wu, Bin Qin, Ming Zhang, Debing Zhang, et al. Partial FC: Training 10 million identities on a single machine. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1445--1449, 2021.
- [2] Qiong Cao, Li Shen, Weidi Xie, Omkar M. Parkhi, and Andrew Zisserman. VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pages 67--74. IEEE, 2018.
- [3] Seunggyu Chang, Gihoon Kim, and Hayeon Kim. HairNeRF: Geometry-aware image synthesis for hairstyle transfer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2448--2458, 2023.
- [5] Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4690--4699, 2019.
- [6] Jiankang Deng, Jia Guo, Evangelos Ververas, Irene Kotsia, and Stefanos Zafeiriou. RetinaFace: Single-shot multi-level face localisation in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5203--5212, 2020.
- [8] Alvaro Gonzalez-Jimenez, Simone Lionetti, Dena Bazazian, Philippe Gottfrois, Fabian Gröger, Alexander Navarini, and Marc Pouly. Hyperbolic metric learning for visual outlier detection. In European Conference on Computer Vision, pages 327--344. Springer, 2024.
- [9] Xinqian Gu, Hong Chang, Bingpeng Ma, Shutao Bai, Shiguang Shan, and Xilin Chen. Clothes-changing person re-identification with RGB modality only. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1060--1069, 2022.
- [10] Raia Hadsell, Sumit Chopra, and Yann LeCun. Dimensionality reduction by learning an invariant mapping. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), volume 2, pages 1735--1742. IEEE, 2006.
- [11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770--778, 2016.
- [12] Shuting He, Hao Luo, Pichao Wang, Fan Wang, Hao Li, and Wei Jiang. TransReID: Transformer-based object re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15013--15022, 2021.
- [13] Yuxiao He, Yiyu Zhuang, Yanwen Wang, Yao Yao, Siyu Zhu, Xiaoyu Li, Qi Zhang, Xun Cao, and Hao Zhu. Head360: Learning a parametric 3D full-head for free-view synthesis in 360°. In European Conference on Computer Vision, pages 254--272. Springer, 2024.
- [14] Rui Huang, Shu Zhang, Tianyu Li, and Ran He. Beyond face rotation: Global and local perception GAN for photorealistic and identity preserving frontal view synthesis. In Proceedings of the IEEE International Conference on Computer Vision, pages 2439--2448, 2017.
- [15] Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. Advances in Neural Information Processing Systems, 33:18661--18673, 2020.
- [16] Minchul Kim, Anil K. Jain, and Xiaoming Liu. AdaFace: Quality adaptive margin for face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18750--18759, 2022.
- [17] Sungyeon Kim, Boseung Jeong, and Suha Kwak. HIER: Metric learning beyond class labels via hierarchical regularization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19903--19912, 2023.
- [18] Shengcai Liao, Anil K. Jain, and Stan Z. Li. Partial face recognition: Alignment-free approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5):1193--1205, 2012.
- [19] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740--755. Springer, 2014.
- [20] Yair Movshovitz-Attias, Alexander Toshev, Thomas K. Leung, Sergey Ioffe, and Saurabh Singh. No fuss distance metric learning using proxies. In Proceedings of the IEEE International Conference on Computer Vision, pages 360--368, 2017.
- [21] Xuelin Qian, Wenxuan Wang, Li Zhang, Fangrui Zhu, Yanwei Fu, Tao Xiang, Yu-Gang Jiang, and Xiangyang Xue. Long-term cloth-changing person re-identification. In Proceedings of the Asian Conference on Computer Vision, 2020.
- [23] Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 815--823, 2015.
- [24] Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, and Nicu Sebe. First order motion model for image animation. Advances in Neural Information Processing Systems, 32, 2019.
- [25] Linsen Song, Wayne Wu, Chen Qian, Ran He, and Chen Change Loy. Everybody's talkin': Let me talk as you want. IEEE Transactions on Information Forensics and Security, 17:585--598, 2022.
- [26] Yifan Sun, Liang Zheng, Yali Li, Yi Yang, Qi Tian, and Shengjin Wang. Learning part-based convolutional features for person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(3):902--917, 2019.
- [27] Luan Tran, Xi Yin, and Xiaoming Liu. Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1415--1424, 2017.
- [28] Ba Tu Truong and Svetha Venkatesh. Video abstraction: A systematic review and classification. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 3(1):3--es, 2007.
- [29] Weitao Wan and Jiansheng Chen. Occlusion robust face recognition based on mask learning. In 2017 IEEE International Conference on Image Processing (ICIP), pages 3795--3799. IEEE, 2017.
- [30] Guanshuo Wang, Yufeng Yuan, Xiong Chen, Jiwei Li, and Xi Zhou. Learning discriminative features with multiple granularities for person re-identification. In Proceedings of the 26th ACM International Conference on Multimedia, pages 274--282, 2018.
- [31] Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large margin cosine loss for deep face recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5265--5274, 2018.
- [37] Qwen3-Omni technical report. arXiv preprint arXiv:2509.17765.
- [39] An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
- [52] VoxCeleb2: Deep speaker recognition. arXiv preprint arXiv:1806.05622.
- [55] Open-ended hierarchical streaming video understanding with vision language models. In Proceedings of the IEEE/CVF International Conference on Computer Vision.
- [60] TokenLearner: What can 8 learned tokens do for images and videos? arXiv preprint arXiv:2106.11297.
- [61] VideoMAE V2: Scaling video masked autoencoders with dual masking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
- [62] Video-to-video synthesis. arXiv preprint arXiv:1808.06601.