Pose-dIVE: Pose-Diversified Augmentation with Diffusion Model for Person Re-Identification
Pith reviewed 2026-05-23 23:49 UTC · model grok-4.3
The pith
A diffusion model conditioned on SMPL-derived poses and viewpoints augments Re-ID training data to reduce pose and camera bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By conditioning the diffusion model on both the human pose and camera viewpoint through the SMPL model, the framework generates augmented training data with diverse human poses and camera viewpoints so that existing Re-ID models learn features unbiased by these variations and generalize better to new camera systems.
What carries the argument
Diffusion model conditioned on SMPL pose and viewpoint parameters, used to synthesize new training images that preserve identity while varying only pose and viewpoint.
If this is right
- Re-ID models trained on the augmented data learn features independent of pose and viewpoint.
- Generalization improves on datasets collected from previously unseen camera setups.
- The method outperforms prior data-augmentation techniques for Re-ID on standard benchmarks.
- The training distribution gains explicit coverage of sparse pose and viewpoint combinations.
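One way to picture "explicit coverage of sparse pose and viewpoint combinations" is to pick generation targets with probability inversely proportional to how often each combination already appears. The sketch below is only an illustration under that assumption; the bin indexing, the add-one smoothing, and the function names are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def inverse_frequency_weights(bin_labels, num_bins):
    """Return one sampling weight per bin; rarer bins get larger weights."""
    counts = Counter(bin_labels)
    # Add-one smoothing (an assumption) keeps weights finite for unseen bins.
    raw = [1.0 / (counts.get(b, 0) + 1) for b in range(num_bins)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_target_bins(bin_labels, num_bins, k, seed=0):
    """Draw k target bins for augmentation, favouring underrepresented bins."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(bin_labels, num_bins)
    return rng.choices(range(num_bins), weights=weights, k=k)


# Hypothetical toy data: one viewpoint-bin index per training image.
# Bin 3 never occurs in the observed data.
observed = [0] * 80 + [1] * 15 + [2] * 5
weights = inverse_frequency_weights(observed, num_bins=4)
targets = sample_target_bins(observed, num_bins=4, k=1000)
target_counts = Counter(targets)
```

In this toy setup the unseen bin receives the largest weight, so most sampled generation targets fall where the original data is thinnest.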
Where Pith is reading between the lines
- The same conditioning approach could be tested on other recognition tasks where viewpoint or pose imbalance limits performance.
- Targeted generation of rare poses might reduce reliance on large-scale real-world data collection for Re-ID.
- Measuring the entropy of pose and viewpoint distributions before and after augmentation would quantify the claimed diversification effect.
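The entropy measurement suggested in the last point can be sketched as follows, assuming each image has already been assigned a discrete pose or viewpoint label. The labels and counts are made-up toy values, not results from the paper.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution over labels."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical viewpoint labels before augmentation (heavily front-biased).
before = ["front"] * 80 + ["side"] * 15 + ["back"] * 5
# Hypothetical augmented set that tops up the rare viewpoints.
after = before + ["side"] * 65 + ["back"] * 75

h_before = shannon_entropy(before)
h_after = shannon_entropy(after)
```

A higher value after augmentation indicates a flatter distribution over the chosen bins; the result depends on how poses and viewpoints are discretized, which is a design choice left open here.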
Load-bearing premise
The generated images must keep the original person's identity intact and change only pose and viewpoint without creating artifacts that Re-ID models can exploit as shortcuts.
What would settle it
The claim would be undermined if an identity-verification network applied to original-versus-generated image pairs showed identity mismatch rates substantially higher than on real image pairs, or if Re-ID accuracy on pose-diverse test sets failed to rise after augmentation.
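The mismatch-rate comparison can be sketched as below, assuming each image has been mapped to an embedding by some identity-verification network. The embeddings, the cosine-similarity criterion, and the threshold of 0.7 are hypothetical choices for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mismatch_rate(pairs, threshold):
    """Fraction of same-identity pairs whose similarity is below threshold."""
    misses = sum(1 for u, v in pairs if cosine_similarity(u, v) < threshold)
    return misses / len(pairs)


# Toy embeddings (assumed): each pair is supposed to show the same person.
real_pairs = [([1.0, 0.0], [0.9, 0.1]), ([0.0, 1.0], [0.1, 0.9])]
generated_pairs = [([1.0, 0.0], [0.8, 0.2]), ([0.0, 1.0], [0.9, 0.1])]

real_rate = mismatch_rate(real_pairs, threshold=0.7)
generated_rate = mismatch_rate(generated_pairs, threshold=0.7)
```

A generated-pair rate well above the real-pair rate would point to identity drift in the synthesized images.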
Original abstract
Person re-identification (Re-ID) often faces challenges due to variations in human poses and camera viewpoints, which significantly affect the appearance of individuals across images. Existing datasets frequently lack diversity and scalability in these aspects, hindering the generalization of Re-ID models to new camera systems or environments. To overcome this, we propose Pose-dIVE, a novel data augmentation approach that incorporates sparse and underrepresented human pose and camera viewpoint examples into the training data, addressing the limited diversity in the original training data distribution. Our objective is to augment the training dataset to enable existing Re-ID models to learn features unbiased by human pose and camera viewpoint variations. By conditioning the diffusion model on both the human pose and camera viewpoint through the SMPL model, our framework generates augmented training data with diverse human poses and camera viewpoints. Experimental results demonstrate the effectiveness of our method in addressing human pose bias and enhancing the generalizability of Re-ID models compared to other data augmentation-based Re-ID approaches.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Pose-dIVE, a data augmentation method for person re-identification that employs a diffusion model conditioned on human pose and camera viewpoint through the SMPL model. The goal is to generate training samples with diverse poses and viewpoints to mitigate bias in Re-ID models caused by limited diversity in existing datasets. The abstract states that experiments demonstrate the method's effectiveness relative to other augmentation-based Re-ID approaches.
Significance. If the generated images preserve source identity while varying only pose and viewpoint, the approach could provide a scalable way to diversify Re-ID training data and improve model generalization across camera systems. The choice to leverage SMPL for explicit 3D control is a reasonable technical direction for pose-conditioned generation.
major comments (2)
- [Abstract] Abstract: The central claim requires that diffusion outputs retain source identity (clothing texture, facial details, appearance) while varying only pose and viewpoint. SMPL supplies 3D body parameters but encodes neither surface texture nor identity-specific cues; the manuscript describes no explicit identity-preserving mechanism such as reference-image cross-attention, perceptual loss, or feature-matching regularizer. This is load-bearing because without it the generated samples can introduce spurious identity cues that the downstream Re-ID model exploits, undermining the bias-reduction objective.
- [Abstract] Abstract: The claim that 'experimental results demonstrate the effectiveness' is made without any quantitative results, baselines, controls, or metrics for identity preservation (e.g., Re-ID feature similarity before/after augmentation or comparison against standard augmentations). This prevents verification of whether gains exceed trivial augmentation or whether identity is actually preserved.
minor comments (1)
- The abstract would be clearer if it included at least one key quantitative result (e.g., rank-1 accuracy improvement on a standard Re-ID benchmark) to support the effectiveness statement.
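For readers unfamiliar with the metric named in the minor comment, rank-1 accuracy can be sketched as the fraction of queries whose nearest gallery feature has the same identity. This simplified version assumes Euclidean distance on precomputed features and omits the same-camera filtering used in standard Re-ID protocols; the data are toy values.

```python
def rank1_accuracy(query, gallery):
    """query and gallery are lists of (identity, feature_vector) tuples."""

    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    hits = 0
    for query_id, query_feature in query:
        # Identity of the gallery entry closest to this query feature.
        nearest_id, _ = min(
            gallery, key=lambda item: squared_distance(query_feature, item[1])
        )
        hits += int(nearest_id == query_id)
    return hits / len(query)


# Toy features (assumed): the third query is closest to the wrong identity.
gallery = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [2.0, 0.0])]
query = [("a", [0.1, 0.0]), ("b", [0.9, 1.1]), ("c", [0.2, 0.1])]

accuracy = rank1_accuracy(query, gallery)
```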
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below and outline revisions to improve clarity and rigor.
- Referee: [Abstract] Abstract: The central claim requires that diffusion outputs retain source identity (clothing texture, facial details, appearance) while varying only pose and viewpoint. SMPL supplies 3D body parameters but encodes neither surface texture nor identity-specific cues; the manuscript describes no explicit identity-preserving mechanism such as reference-image cross-attention, perceptual loss, or feature-matching regularizer. This is load-bearing because without it the generated samples can introduce spurious identity cues that the downstream Re-ID model exploits, undermining the bias-reduction objective.
  Authors: We agree that explicit identity preservation is essential to ensure the augmentation varies only pose and viewpoint without introducing spurious cues. The manuscript conditions the diffusion model on SMPL parameters for pose and viewpoint but does not detail an additional identity-preserving component such as reference cross-attention or perceptual losses. We will revise the method section to explicitly describe the identity preservation strategy (e.g., by incorporating source-image conditioning) and add quantitative verification of identity retention.
  Revision: yes
- Referee: [Abstract] Abstract: The claim that 'experimental results demonstrate the effectiveness' is made without any quantitative results, baselines, controls, or metrics for identity preservation (e.g., Re-ID feature similarity before/after augmentation or comparison against standard augmentations). This prevents verification of whether gains exceed trivial augmentation or whether identity is actually preserved.
  Authors: The abstract summarizes the experimental outcomes at a high level, while the full paper presents quantitative results, baselines, and comparisons in the experiments section. However, we acknowledge that the abstract lacks specific metrics for identity preservation. In the revision we will update the abstract to include key quantitative highlights and ensure identity-preservation metrics (such as feature similarity) are reported and discussed.
  Revision: yes
Circularity Check
No significant circularity; generative pipeline evaluated externally
The paper describes a conditional diffusion pipeline for pose/viewpoint augmentation in Re-ID, with success measured by downstream model accuracy on held-out datasets rather than by internal consistency with its own outputs. No equations, fitted parameters, or predictions are presented that reduce by construction to the inputs (e.g., no self-definitional ratios or renamed empirical patterns). Self-citations, if present, are not load-bearing for any uniqueness claim. The method is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: SMPL model accurately encodes human pose and camera viewpoint from 2D images
- domain assumption: Diffusion models can generate identity-preserving images when conditioned on SMPL parameters
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "By conditioning the diffusion model on both the human pose and camera viewpoint through the SMPL model, our framework generates augmented training data with diverse human poses and camera viewpoints."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: "we leverage the knowledge of pre-trained large-scale diffusion models... reference U-Net... pose guider network"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 3 Pith papers
- An Empirical Study of Validating Synthetic Data for Text-Based Person Retrieval
  Empirical study of a fully synthetic data generation pipeline for text-based person retrieval that tests its use as a replacement or augmentation for real data across scenarios.
- SD-ReID: View-aware Stable Diffusion for Aerial-Ground Person Re-Identification
  SD-ReID trains a ViT to extract identity and view conditions, fine-tunes Stable Diffusion to generate view-mimicking features, adds a View-Refined Decoder, and combines both identity and all-view features for retrieva...
- ID-Sim: An Identity-Focused Similarity Metric
  ID-Sim is a new similarity metric that aims to capture human selective sensitivity to identities by training on curated real and generative synthetic data and validating against human annotations on recognition, retri...