Seed-story: Multimodal long story generation with large language model
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
citing papers explorer
- Setting the Stage: Text-Driven Scene-Consistent Image Generation
A new data pipeline using real photos, entity removal, and image-to-video models plus a cross-view attention loss enables text-driven generation of actors in reference scenes with improved alignment.
- AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models
AttriStory adds a benchmark and AttriLoss-based latent optimization to improve faithful rendering of fine-grained attributes such as clothing color and texture in diffusion-model visual storytelling.