Recognition: unknown
Language Guided Fashion Image Manipulation with Feature-wise Transformations
read the original abstract
Developing techniques for editing an outfit image through natural sentences and accordingly generating new outfits has promising applications for art, fashion and design. However, it is considered as a certainly challenging task since image manipulation should be carried out only on the relevant parts of the image while keeping the remaining sections untouched. Moreover, this manipulation process should generate an image that is as realistic as possible. In this work, we propose FiLMedGAN, which leverages feature-wise linear modulation (FiLM) to relate and transform visual features with natural language representations without using extra spatial information. Our experiments demonstrate that this approach, when combined with skip connections and total variation regularization, produces more plausible results than the baseline work, and has a better localization capability when generating new outfits consistent with the target description.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.