STFER uses LVLM-generated identity-consistent semantic text to drive visual token filtering and expert routing for improved any-time person re-identification under clothing changes and modality shifts.
Self-supervised learning of whole and component-based semantic representations for person re-identification
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2representative citing papers
Transformer-based ReID embeddings encode BMI most strongly in deeper layers, followed by pitch, gender, and yaw, with pose peaking in middle layers and BMI increasing with depth; cross-spectral settings shift reliance toward structural cues.
citing papers explorer
-
Beyond Visual Cues: Semantic-Driven Token Filtering and Expert Routing for Anytime Person ReID
STFER uses LVLM-generated identity-consistent semantic text to drive visual token filtering and expert routing for improved any-time person re-identification under clothing changes and modality shifts.
-
AttriBE: Quantifying Attribute Expressivity in Body Embeddings for Recognition and Identification
Transformer-based ReID embeddings encode BMI most strongly in deeper layers, followed by pitch, gender, and yaw, with pose peaking in middle layers and BMI increasing with depth; cross-spectral settings shift reliance toward structural cues.