DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman · 2023

7 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters

cs.CV · 2026-02-25 · conditional · novelty 8.0

MasqLoRA shows that an independent LoRA adapter can be trained on a few trigger-target pairs to backdoor diffusion models with a 99.8% success rate while remaining stealthy when the trigger is absent.

AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

AttriStory adds a benchmark and AttriLoss-based latent optimization to improve faithful rendering of fine-grained attributes such as clothing color and texture in diffusion-model visual storytelling.

Fashion130K: An E-commerce Fashion Dataset for Outfit Generation with Unified Multi-modal Condition

cs.CV · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

The Fashion130K dataset and UMC framework align text and visual prompts to generate more consistent fashion outfits than prior state-of-the-art methods.

Gate-and-Merge: Zero-shot Compositional Personalization of Vision Language Models

cs.CV · 2026-05-09 · unverdicted · novelty 6.0

Gate-and-Merge enables zero-shot compositional personalization of VLMs by independently learning concept-specific LoRA adapters and merging them in weight space with cue-based gating to suppress interference.

RealDiffusion: Physics-informed Attention for Multi-character Storybook Generation

cs.CV · 2026-05-12 · unverdicted · novelty 5.0

RealDiffusion uses heat diffusion as a dissipative prior and a region-aware stochastic process inside a training-free physics-informed attention mechanism to improve multi-character coherence while preserving narrative dynamism in sequential image generation.

3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models

cs.CV · 2026-04-07 · unverdicted · novelty 5.0

A framework that combines MLLM-based image enhancement with a medium-aware 3D Gaussian Splatting model to reconstruct and render smoke scenes.

ID-Sim: An Identity-Focused Similarity Metric

cs.CV · 2026-04-06 · unverdicted · novelty 5.0

ID-Sim is a new similarity metric that aims to capture human selective sensitivity to identities by training on curated real and generative synthetic data and validating against human annotations on recognition, retrieval, and generative tasks.

citing papers explorer

Showing 7 of 7 citing papers.

When LoRA Betrays: Backdooring Text-to-Image Models by Masquerading as Benign Adapters
cs.CV · 2026-02-25 · conditional · none · ref 32
AttriStory: Fine-grained Attribute Realization for Visual Storytelling with Diffusion Models
cs.CV · 2026-05-20 · unverdicted · none · ref 28
Fashion130K: An E-commerce Fashion Dataset for Outfit Generation with Unified Multi-modal Condition
cs.CV · 2026-05-11 · unverdicted · none · ref 42 · 2 links
Gate-and-Merge: Zero-shot Compositional Personalization of Vision Language Models
cs.CV · 2026-05-09 · unverdicted · none · ref 13
RealDiffusion: Physics-informed Attention for Multi-character Storybook Generation
cs.CV · 2026-05-12 · unverdicted · none · ref 26
3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models
cs.CV · 2026-04-07 · unverdicted · none · ref 42
ID-Sim: An Identity-Focused Similarity Metric
cs.CV · 2026-04-06 · unverdicted · none · ref 59
