Storymaker: Towards holistic consistent characters in text-to-image generation. arXiv preprint arXiv:2409.12576

Zhengguang Zhou, Jing Li, Huaxia Li, Nemo Chen, Xu Tang · 2024 · arXiv 2409.12576

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Can Image Models Imagine Time? ImageTime: A Novel Benchmark for Probing Visual World Modeling Through Spatiotemporal Consistency

cs.CV · 2026-06-09 · unverdicted · novelty 7.0

ImageTime is a benchmark that probes image generation models' visual world modeling by requiring coherent four-state sequences in single images, scored via VLM judge.

FreeStory: Training-Free Character Consistency for Free-Form Visual Storytelling

cs.CV · 2026-06-23 · unverdicted · novelty 6.0

FreeStory reformulates character consistency as entity-grounded feature reuse for free-form prompts, introduces FreeStoryBench, and reports stronger consistency than baselines among training-free methods.

AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

AttriStory adds a benchmark and AttriLoss-based latent optimization to improve faithful rendering of fine-grained attributes such as clothing color and texture in diffusion-model visual storytelling.

DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior

cs.CV · 2026-04-19 · unverdicted · novelty 6.0

DreamShot uses video diffusion priors and a role-attention consistency loss to produce coherent, personalized storyboards with better character and scene continuity than text-to-image methods.

citing papers explorer

Can Image Models Imagine Time? ImageTime: A Novel Benchmark for Probing Visual World Modeling Through Spatiotemporal Consistency
cs.CV · 2026-06-09 · unverdicted · none · ref 58

FreeStory: Training-Free Character Consistency for Free-Form Visual Storytelling
cs.CV · 2026-06-23 · unverdicted · none · ref 36

AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models
cs.CV · 2026-05-20 · unverdicted · none · ref 41

DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior
cs.CV · 2026-04-19 · unverdicted · none · ref 59
