Fashion-Gen: The Generative Fashion Dataset and Challenge

Chris Pal; Christian Jauvin; Negar Rostamzadeh; Seyedarian Hosseini; Thomas Boquet; Wojciech Stokowiec; Ying Zhang

arxiv: 1806.08317 · v2 · pith:5ARYHABKnew · submitted 2018-06-21 · 📊 stat.ML · cs.LG

Negar Rostamzadeh, Seyedarian Hosseini, Thomas Boquet, Wojciech Stokowiec, Ying Zhang, Christian Jauvin, Chris Pal

keywords: dataset, challenge, descriptions, fashion, generation, image, item, upon

We introduce a new dataset of 293,008 high definition (1360 x 1360 pixels) fashion images paired with item descriptions provided by professional stylists. Each item is photographed from a variety of angles. We provide baseline results on 1) high-resolution image generation, and 2) image generation conditioned on the given text descriptions. We invite the community to improve upon these baselines. In this paper, we also outline the details of a challenge that we are launching based upon this dataset.

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

FashionMV: Product-Level Composed Image Retrieval with Multi-View Fashion Data
cs.CV 2026-04 unverdicted novelty 8.0
FashionMV introduces product-level multi-view CIR, a 127K-product dataset built via automated LMM pipeline, and a 0.8B ProCIR model that beats larger baselines on three fashion benchmarks.

FashionLens: Toward Versatile Fashion Image Retrieval via Task-Adaptive Learning
cs.CV 2026-05 unverdicted novelty 7.0
FashionLens is a task-adaptive MLLM framework that achieves SOTA performance on diverse fashion image retrieval scenarios via spherical query calibration and gradient-guided sampling.

Uncovering the Latent Potential of Deep Intermediate Representations
cs.LG 2026-05 unverdicted novelty 6.0
Introduces LOES, a constructive spectral method to select task-discriminative subspaces from intermediate layer embeddings, and GeoReg for enforcing simplicial class geometry during fine-tuning, with reported gains in...

Topology-Aware Representation Alignment for Semi-Supervised Vision-Language Learning
cs.CV 2026-04 unverdicted novelty 6.0
ToMA uses persistent homology on H0-death and lightweight H1-birth edges to align multimodal manifolds, delivering stable gains on remote sensing and consistent benefits on fashion retrieval.

ReflectCAP: Detailed Image Captioning with Reflective Memory
cs.AI 2026-04 unverdicted novelty 6.0
ReflectCAP distills model-specific hallucination and oversight patterns into Structured Reflection Notes that steer LVLMs toward more factual and complete image captions, reaching the Pareto frontier on factuality-cov...

FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding
cs.CV 2026-04 unverdicted novelty 6.0
FashionStylist is an expert-annotated benchmark dataset that unifies outfit-to-item grounding, completion, and evaluation tasks for multimodal large language models in fashion.