SAEdit: Token-level control for continuous image editing via sparse autoencoder. arXiv preprint arXiv:2510.05081, 2025
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
representative citing papers
A method that treats 3D box pairs as exact transformation specs, adds a depth-aware floor reference, and trains an image generator on synthetic scenes plus Objectron videos to perform large 3D edits on real photographs.
Token-to-Token alignment rephrases prompts into shared structure then matches token embeddings by semantic similarity, making linear interpolation a meaningful operation for blending in text-to-image models.
citing papers explorer
- Semantic Browsing: Controllable Diversity for Image Generation
  A technique for controllable diversity in text-to-image generation by inducing structured semantic variations at the prompt level via VLM and agentic workflow.
- Thinking in Boxes: 3D Editing in Real Images Made Easy
  A method that treats 3D box pairs as exact transformation specs, adds a depth-aware floor reference, and trains an image generator on synthetic scenes plus Objectron videos to perform large 3D edits on real photographs.
- Token-to-Token Alignment of Text Embeddings for Semantic Blending
  Token-to-Token alignment rephrases prompts into shared structure then matches token embeddings by semantic similarity, making linear interpolation a meaningful operation for blending in text-to-image models.