Resolution-invariant Person Re-Identification

Ming Yang; Shiliang Zhang; Shunan Mao

arxiv: 1906.09748 · v2 · pith:FQ77ESCHnew · submitted 2019-06-24 · 💻 cs.CV

Shunan Mao, Shiliang Zhang, Ming Yang

Pith reviewed 2026-05-25 17:56 UTC · model grok-4.3

classification 💻 cs.CV

keywords: person re-identification · resolution invariance · super-resolution · feature extraction · convolutional neural network · attention mechanism · low-resolution images · end-to-end training

The pith

Jointly training a foreground super-resolution module and a dual-attention feature extractor produces person representations robust to large resolution differences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to make person re-identification work when input images vary widely in resolution, a common issue in real camera networks. It trains a super-resolution component that sharpens only the person foreground and a feature extractor with separate low- and high-resolution paths that are combined via attention, all inside one end-to-end network. A sympathetic reader would care because this removes the need to preprocess or retrain separately for each camera quality. Experiments on five datasets show the resulting features improve matching accuracy, especially when low-resolution images are involved.

Core claim

The central claim is that end-to-end CNN training of the Foreground-Focus Super-Resolution module, built as a fully convolutional auto-encoder with skip connections and trained under a foreground focus loss, together with the Resolution-Invariant Feature Extractor that runs two streams weighted by a dual-attention block, produces a representation whose matching performance stays strong across large resolution changes.
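To make the foreground focus loss more concrete, here is a minimal sketch under stated assumptions. The text above says only that FFSR is trained with a loss that emphasizes the person foreground; the exact form is not given here. The sketch assumes a per-pixel squared error weighted by a soft foreground mask, and the names `foreground_focus_loss`, `mask`, and `bg_weight` are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def foreground_focus_loss(pred: Image, target: Image, mask: Image,
                          bg_weight: float = 0.1) -> float:
    """Mask-weighted mean squared error (an assumed form, not the paper's).

    mask[i][j] in [0, 1] is the foreground probability of a pixel.
    Foreground pixels get a weight near 1, background pixels get bg_weight.
    """
    weighted_error = 0.0
    total_weight = 0.0
    for pred_row, target_row, mask_row in zip(pred, target, mask):
        for p, t, m in zip(pred_row, target_row, mask_row):
            w = bg_weight + (1.0 - bg_weight) * m
            weighted_error += w * (p - t) ** 2
            total_weight += w
    return weighted_error / total_weight


target = [[0.0, 1.0], [1.0, 0.0]]
mask = [[0.0, 1.0], [1.0, 0.0]]       # toy person region on the anti-diagonal
fg_error = [[0.0, 0.5], [1.0, 0.0]]   # reconstruction error on a foreground pixel
bg_error = [[0.5, 1.0], [1.0, 0.0]]   # same-size error on a background pixel

loss_fg = foreground_focus_loss(fg_error, target, mask)
loss_bg = foreground_focus_loss(bg_error, target, mask)
```

With this weighting, an error of the same magnitude costs more when it falls on the person than when it falls on the background, which is the behavior the module name suggests.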

What carries the argument

Joint end-to-end training of the FFSR foreground super-resolution module and the RIFE dual-stream feature extractor with dual-attention weighting.
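The dual-stream weighting can be illustrated with a small sketch. The internals of the dual-attention block are not described in this text, so the code below swaps in a generic soft weighting: two scalar attention scores are passed through a softmax and used to blend the low-resolution and high-resolution stream features. The names `softmax2` and `fuse_streams`, and the idea of scalar scores, are assumptions for illustration only.

```python
import math
from typing import List, Tuple


def softmax2(score_low: float, score_high: float) -> Tuple[float, float]:
    """Numerically stable softmax over two attention scores."""
    m = max(score_low, score_high)
    e_low = math.exp(score_low - m)
    e_high = math.exp(score_high - m)
    z = e_low + e_high
    return e_low / z, e_high / z


def fuse_streams(feat_low: List[float], feat_high: List[float],
                 score_low: float, score_high: float) -> List[float]:
    """Blend the two stream features with weights that sum to 1."""
    w_low, w_high = softmax2(score_low, score_high)
    return [w_low * a + w_high * b for a, b in zip(feat_low, feat_high)]


feat_low = [1.0, 0.0, 2.0]    # toy feature from the low-resolution stream
feat_high = [0.0, 1.0, 4.0]   # toy feature from the high-resolution stream

balanced = fuse_streams(feat_low, feat_high, 0.0, 0.0)
low_res_input = fuse_streams(feat_low, feat_high, 5.0, -5.0)
```

When the scores are equal the fused feature is the plain average; when the low-resolution score dominates, the output is almost entirely the low-resolution stream. In a trained network the scores would be predicted from the input rather than supplied by hand.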

If this is right

Rank-1 accuracy reaches 36.4 percent on CAVIAR and 73.3 percent on MLR-CUHK03, exceeding prior methods by 2.9 and 2.6 percentage points.
The same trained model handles both low- and high-resolution inputs without separate branches or preprocessing.
The learned features show consistent gains across five datasets that together cover a large range of resolutions.
The approach removes the requirement for resolution-specific data collection or model retraining in deployment.
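For readers unfamiliar with the metric, Rank-1 accuracy is the fraction of query images whose nearest gallery image shares the query's identity. The sketch below computes it with squared Euclidean distance on toy features. It is a simplified illustration: the function names and data are hypothetical, and benchmark protocols typically add steps (such as excluding same-camera matches) that are omitted here.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank1_accuracy(query_feats: List[Sequence[float]], query_ids: List[int],
                   gallery_feats: List[Sequence[float]],
                   gallery_ids: List[int]) -> float:
    """Fraction of queries whose closest gallery item has the same identity."""
    hits = 0
    for feat, pid in zip(query_feats, query_ids):
        nearest = min(range(len(gallery_feats)),
                      key=lambda i: squared_distance(feat, gallery_feats[i]))
        hits += int(gallery_ids[nearest] == pid)
    return hits / len(query_feats)


# Toy data: three identities, one gallery feature each.
gallery_feats = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
gallery_ids = [7, 8, 9]
query_feats = [[0.1, 0.0], [0.9, 1.2], [1.4, 1.4]]
query_ids = [7, 8, 9]

accuracy = rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids)
```

In this toy case the third query lies closer to identity 8 than to its true identity 9, so two of three queries are matched correctly.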

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same joint-training pattern could be tested on other matching tasks such as vehicle re-identification where camera resolution also varies.
Foreground focus during super-resolution might reduce the impact of background clutter in crowded scenes beyond the re-identification setting.
If the invariance proves stable, it could lower the volume of high-resolution training images needed for new camera networks.
Deployment in uncontrolled multi-camera environments would provide a direct test of whether dataset-specific retuning is truly unnecessary.

Load-bearing premise

The foreground focus loss and dual-attention weighting produce resolution-invariant features that hold outside the five evaluated datasets and do not require dataset-specific tuning of the joint objective.

What would settle it

Performance on a sixth dataset containing person images across a wide resolution range falls to or below the accuracy of prior state-of-the-art methods.

Figures

Figures reproduced from arXiv: 1906.09748 by Ming Yang, Shiliang Zhang, Shunan Mao.

**Figure 2.** Values of the objective function O in Eq. (2) computed under varying resolution on MSMT17 and Market1501. (a) fixes r1 = r2 and increases both from 0.125 to 1. (b) fixes r2 = 1 and increases r1 from 0.125 to 1. The result indicates that both low resolution and varied resolution increase the difficulty of person ReID. where ‖·‖₂² computes the distance between feature vectors. D_dif(·) and D_sim(·) compute the di…

**Figure 3.** The architecture of our network, which consists of two modules: Foreground-Focus Super-Resolution (FFSR) and Resolution…

**Figure 4.** Effects of FFSR and RIFE on the objective function…

**Figure 5.** Sample results of person ReID and super resolution on…
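Figure 2 sweeps the resolution ratios r1 and r2 between 0.125 and 1. One way such a sweep could be simulated is sketched below. It assumes, since this text does not say how the paper does it, that a ratio r means downsampling an image to r times its original size and resizing it back with nearest-neighbor interpolation. The names `resize_nearest` and `simulate_resolution` are hypothetical.

```python
from typing import List

Image = List[List[float]]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbor resize of a 2D image to (out_h, out_w)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def simulate_resolution(image: Image, r: float) -> Image:
    """Downsample by ratio r (0 < r <= 1), then resize back to the input size."""
    h, w = len(image), len(image[0])
    low_h, low_w = max(1, round(h * r)), max(1, round(w * r))
    low = resize_nearest(image, low_h, low_w)
    return resize_nearest(low, h, w)


# Toy 8x8 image whose pixel value encodes its position.
image = [[float(i * 8 + j) for j in range(8)] for i in range(8)]

full = simulate_resolution(image, 1.0)       # unchanged
coarse = simulate_resolution(image, 0.125)   # collapses to a single source pixel
half = simulate_resolution(image, 0.5)       # 4x4 detail stretched back to 8x8
```

At r = 1 the image is returned unchanged, while at r = 0.125 an 8x8 input keeps only one distinct pixel value, which gives a sense of how much detail the lowest ratio discards.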

Original abstract

Exploiting resolution invariant representation is critical for person Re-Identification (ReID) in real applications, where the resolutions of captured person images may vary dramatically. This paper learns person representations robust to resolution variance through jointly training a Foreground-Focus Super-Resolution (FFSR) module and a Resolution-Invariant Feature Extractor (RIFE) by end-to-end CNN learning. FFSR upscales the person foreground using a fully convolutional auto-encoder with skip connections learned with a foreground focus training loss. RIFE adopts two feature extraction streams weighted by a dual-attention block to learn features for low and high resolution images, respectively. These two complementary modules are jointly trained, leading to a strong resolution invariant representation. We evaluate our methods on five datasets containing person images at a large range of resolutions, where our methods show substantial superiority to existing solutions. For instance, we achieve Rank-1 accuracy of 36.4% and 73.3% on CAVIAR and MLR-CUHK03, outperforming the state-of-the-art by 2.9% and 2.6%, respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

Joint FFSR and RIFE training yields small gains on low-res ReID benchmarks but the invariance claim lacks direct support beyond the five tested datasets.

The key point is that this paper trains a foreground-focused super-resolution module and a dual-attention feature extractor together end-to-end, then reports Rank-1 gains over prior work of 2.9% on CAVIAR and 2.6% on MLR-CUHK03, as part of an evaluation on five datasets with mixed resolutions. That combination is the concrete new element, and the numbers show it beats existing solutions on those specific low-resolution cases. The foreground focus loss and skip-connection auto-encoder, plus the two-stream attention setup, are presented as the mechanism that produces the invariant features. The work does address a practical ReID deployment issue where camera resolutions differ, and the reported margins are positive though modest.

The soft spots sit in the missing pieces: the abstract supplies no training protocol, no error bars, no ablation tables, and no cross-dataset transfer results. The load-bearing step, that the joint objective creates resolution invariance that holds without dataset-specific retuning, rests on the assumption that the foreground loss and dual attention generalize, but nothing in the given text tests that directly. The gains could partly reflect per-dataset tuning rather than a robust property.

This paper is for people already working on person ReID who need to handle real camera variation. A reader in that subfield could extract the pipeline idea and test it themselves. It deserves peer review because the core problem is real and the method is spelled out enough to replicate and extend, even if the current evidence is narrow.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that jointly training a Foreground-Focus Super-Resolution (FFSR) module—an auto-encoder with skip connections trained via a foreground-focus loss—and a Resolution-Invariant Feature Extractor (RIFE) with dual-attention weighted streams for low- and high-resolution images produces a strong resolution-invariant representation for person ReID. It reports concrete gains on five datasets spanning large resolution ranges, including Rank-1 accuracies of 36.4% on CAVIAR (+2.9% over prior art) and 73.3% on MLR-CUHK03 (+2.6%).

Significance. If the joint objective demonstrably yields transferable invariance, the approach could meaningfully improve ReID robustness in practical surveillance settings with variable camera resolutions. The multi-dataset evaluation with explicit margins over baselines provides a starting point for assessing utility, though the absence of isolating experiments leaves the source of the gains unclear.

major comments (2)

[Abstract] Abstract: the central claim that end-to-end joint training of FFSR and RIFE yields a 'strong resolution invariant representation' rests on aggregate Rank-1 improvements across five datasets, yet no ablation isolating the joint objective, no cross-dataset transfer results, and no analysis of the foreground-focus loss or dual-attention weighting are referenced to show that invariance holds without dataset-specific retuning.
[Abstract] Abstract, reported results on CAVIAR and MLR-CUHK03: the +2.9% and +2.6% margins are presented without training details, error bars, baseline implementations, or component-wise ablations, preventing verification that the gains derive from resolution invariance rather than per-dataset hyperparameter effects or the super-resolution module alone.

minor comments (1)

The abstract provides no information on network architectures, loss weighting, optimization schedule, or dataset splits, which are standard for reproducibility in CNN-based ReID papers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback. We address the major comments point by point below, indicating where revisions to the manuscript will be made to improve clarity and support for the claims.

Referee: [Abstract] Abstract: the central claim that end-to-end joint training of FFSR and RIFE yields a 'strong resolution invariant representation' rests on aggregate Rank-1 improvements across five datasets, yet no ablation isolating the joint objective, no cross-dataset transfer results, and no analysis of the foreground-focus loss or dual-attention weighting are referenced to show that invariance holds without dataset-specific retuning.

Authors: The evaluation spans five datasets with large resolution variations using a single consistent model and training protocol, which supports transferability of the invariance without per-dataset retuning. We agree the abstract would benefit from explicit pointers to the component analyses already present in the body (comparisons isolating joint training, foreground-focus loss, and dual-attention). We will revise the abstract accordingly and ensure the multi-dataset results are framed as evidence of invariance. revision: partial
Referee: [Abstract] Abstract, reported results on CAVIAR and MLR-CUHK03: the +2.9% and +2.6% margins are presented without training details, error bars, baseline implementations, or component-wise ablations, preventing verification that the gains derive from resolution invariance rather than per-dataset hyperparameter effects or the super-resolution module alone.

Authors: We will add a dedicated section or expanded supplementary material with training hyper-parameters, baseline re-implementation details, and component-wise ablations that isolate the joint objective from the super-resolution module alone. Single-run results are standard in the literature; we can note this limitation and, if feasible, report variability from additional seeds in the revision. revision: yes
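The variability reporting mentioned in this response could be as simple as a mean and sample standard deviation over repeated runs. The sketch below shows one way to compute them; the function name `summarize_runs` is hypothetical, and the scores are placeholder values chosen for easy arithmetic, not results from the paper.

```python
import statistics
from typing import List, Tuple


def summarize_runs(rank1_scores: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of Rank-1 scores across seeds."""
    mean = statistics.mean(rank1_scores)
    # A single run has no spread to estimate, so report 0.0 in that case.
    std = statistics.stdev(rank1_scores) if len(rank1_scores) > 1 else 0.0
    return mean, std


# Placeholder scores for three seeds (illustrative only, not measured values).
runs = [72.0, 73.0, 74.0]
mean, std = summarize_runs(runs)
summary = f"{mean:.1f} ± {std:.1f}"
```

A reported margin such as +2.6 points is easier to interpret once the seed-to-seed spread of both the proposed model and the baseline is known.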

Circularity Check

0 steps flagged

No circularity: empirical method with independent evaluation

The paper describes a joint end-to-end training procedure for an FFSR auto-encoder module and a dual-stream RIFE feature extractor, with performance measured by Rank-1 accuracy on five external datasets. No equations, fitted parameters renamed as predictions, self-citations used as load-bearing uniqueness theorems, or ansatzes smuggled via prior work appear in the provided text. The reported gains are presented as outcomes of the proposed architecture and loss, not as quantities forced by construction from the inputs themselves. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Abstract introduces two new modules whose effectiveness rests on unstated assumptions about the foreground loss and attention weighting; no explicit free parameters or external axioms are named.

invented entities (2)

Foreground-Focus Super-Resolution (FFSR) module · no independent evidence
purpose: Upscale person foreground via fully convolutional auto-encoder with skip connections and foreground focus loss
New module proposed for the joint training pipeline

Resolution-Invariant Feature Extractor (RIFE) · no independent evidence
purpose: Extract features via two weighted streams for low- and high-resolution images using dual-attention block
New extractor proposed to complement FFSR

pith-pipeline@v0.9.0 · 5721 in / 1158 out tokens · 26756 ms · 2026-05-25T17:56:46.551512+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: jointly training a Foreground-Focus Super-Resolution (FFSR) module and a Resolution-Invariant Feature Extractor (RIFE) by end-to-end CNN learning... foreground focus training loss... dual-attention block... resolution weighting loss LR

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: We evaluate our methods on five datasets containing person images at a large range of resolutions

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

[1] Yun-Chun Chen, Yu-Jhe Li, Xiao-fei Du, and Yu-Chiang Frank Wang. Learning resolution-invariant deep representations for person re-identification. In AAAI, 2019.
[2] Dong Seon Cheng, Marco Cristani, Michele Stoppa, Loris Bazzani, and Vittorio Murino. Custom pictorial structures for re-identification. In BMVC. Citeseer, 2011.
[3] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Learning a deep convolutional network for image super-resolution. In ECCV, pages 184–…, 2014.
[4] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In NIPS, pages 2672–2680, 2014.
[5] Douglas Gray and Hai Tao. Viewpoint invariant pedestrian recognition with an ensemble of localized features. In ECCV, pages 262–275. Springer, 2008.
[6] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
[7] Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In CVPR, pages 7132–7141, 2018.
[8] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In CVPR, pages 2261–2269, 2017.
[9] Eldar Insafutdinov, Leonid Pishchulin, Bjoern Andres, Mykhaylo Andriluka, and Bernt Schiele. Deepercut: A deeper, stronger, and faster multi-person pose estimation model. In ECCV, pages 34–50. Springer, 2016.
[10] Max Jaderberg, Karen Simonyan, Andrew Zisserman, et al. Spatial transformer networks. In NIPS, pages 2017–2025, 2015.
[11] Jiening Jiao, Wei-Shi Zheng, Ancong Wu, Xiatian Zhu, and Shaogang Gong. Deep low-resolution person re-identification. In AAAI, 2018.
[12] Xiao-Yuan Jing, Xiaoke Zhu, Fei Wu, Xinge You, Qinglong Liu, Dong Yue, Ruimin Hu, and Baowen Xu. Super-resolution person re-identification with semi-coupled low-rank discriminant dictionary learning. In CVPR, pages 695–704, 2015.
[13] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Accurate image super-resolution using very deep convolutional networks. In CVPR, pages 1646–1654, 2016.
[14] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew P Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In CVPR, volume 2, page 4, 2017.
[15] Wei Li, Rui Zhao, Tong Xiao, and Xiaogang Wang. Deepreid: Deep filter pairing neural network for person re-identification. In CVPR, pages 152–159, 2014.
[16] Xiang Li, Wei-Shi Zheng, Xiaojuan Wang, Tao Xiang, and Shaogang Gong. Multi-scale learning for low-resolution person re-identification. In ICCV, pages 3765–3773, 2015.
[17] Jianing Li, Shiliang Zhang, and Tiejun Huang. Multi-scale 3d convolution network for video based person re-identification. In AAAI, 2019.
[18] Xiaojiao Mao, Chunhua Shen, and Yu-Bin Yang. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In NIPS, pages 2802–2810, 2016.
[19] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In MICCAI, pages 234–241. Springer, 2015.
[20] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252, 2015.
[21] Ying Tai, Jian Yang, and Xiaoming Liu. Image super-resolution via deep recursive residual network. In CVPR, volume 1, page 5, 2017.
[22] Zheng Wang, Ruimin Hu, Yi Yu, Junjun Jiang, Chao Liang, and Jinqiao Wang. Scale-adaptive low-resolution person re-identification via learning a discriminating surface. In IJCAI, pages 2669–2675, 2016.
[23] Longhui Wei, Shiliang Zhang, Wen Gao, and Qi Tian. Person transfer gan to bridge domain gap for person re-identification. In CVPR, pages 79–88, 2018.
[24] Xin Yu, Basura Fernando, Richard Hartley, and Fatih Porikli. Super-resolving very low-resolution face images with supplementary attributes. In CVPR, pages 908–917, 2018.
[25] Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, and Qi Tian. Scalable person re-identification: A benchmark. In ICCV, pages 1116–1124, 2015.
