Anydoor: Zero-shot object-level im- age customization

Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao · 2023 · arXiv 2307.09481

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

iTryOn is a video diffusion Transformer that injects spatial 3D hand guidance and semantic action captions to enable interactive garment replacement in videos.

VACE: All-in-One Video Creation and Editing

cs.CV · 2025-03-10 · unverdicted · novelty 7.0

VACE unifies reference-to-video generation, video-to-video editing, and masked video-to-video editing in one Diffusion Transformer framework using a Video Condition Unit for inputs and a Context Adapter for task injection.

PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

PostureObjectStitch generates assembly-aware anomaly images by decoupling multi-view features into high-frequency, texture and RGB components, modulating them temporally in a diffusion model, and applying conditional loss plus geometric priors to preserve correct component relationships.

OmniPrism: Learning Disentangled Visual Concept for Image Generation

cs.CV · 2024-12-16 · unverdicted · novelty 6.0

OmniPrism proposes a disentanglement method using a new paired dataset (PCD-200K), COD contrastive training, and block embeddings to inject separated concepts into diffusion models for multi-aspect image generation.

InstantID: Zero-shot Identity-Preserving Generation in Seconds

cs.CV · 2024-01-15 · unverdicted · novelty 6.0

InstantID enables zero-shot identity-preserving image generation from one facial image via a novel IdentityNet that combines strong semantic and weak spatial conditioning with text prompts in diffusion models.

Common Inpainted Objects In-N-Out of Context

cs.CV · 2025-05-31 · unverdicted · novelty 5.0

COinCO is a new dataset of inpainted COCO images with in- and out-of-context objects, enabling context reasoning, object prediction from scenes, and improved fake image detection.

Wan: Open and Advanced Large-Scale Video Generative Models

cs.CV · 2025-03-26 · unverdicted · novelty 5.0

Wan releases open 1.3B and 14B video diffusion models claiming superior performance over open-source and commercial baselines across multiple tasks with consumer-grade efficiency.

citing papers explorer

Showing 7 of 7 citing papers.

iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance cs.CV · 2026-05-20 · unverdicted · none · ref 87
iTryOn is a video diffusion Transformer that injects spatial 3D hand guidance and semantic action captions to enable interactive garment replacement in videos.
VACE: All-in-One Video Creation and Editing cs.CV · 2025-03-10 · unverdicted · none · ref 11
VACE unifies reference-to-video generation, video-to-video editing, and masked video-to-video editing in one Diffusion Transformer framework using a Video Condition Unit for inputs and a Context Adapter for task injection.
PostureObjectstitch: Anomaly Image Generation Considering Assembly Relationships in Industrial Scenarios cs.CV · 2026-04-15 · unverdicted · none · ref 5
PostureObjectStitch generates assembly-aware anomaly images by decoupling multi-view features into high-frequency, texture and RGB components, modulating them temporally in a diffusion model, and applying conditional loss plus geometric priors to preserve correct component relationships.
OmniPrism: Learning Disentangled Visual Concept for Image Generation cs.CV · 2024-12-16 · unverdicted · none · ref 4
OmniPrism proposes a disentanglement method using a new paired dataset (PCD-200K), COD contrastive training, and block embeddings to inject separated concepts into diffusion models for multi-aspect image generation.
InstantID: Zero-shot Identity-Preserving Generation in Seconds cs.CV · 2024-01-15 · unverdicted · none · ref 2
InstantID enables zero-shot identity-preserving image generation from one facial image via a novel IdentityNet that combines strong semantic and weak spatial conditioning with text prompts in diffusion models.
Common Inpainted Objects In-N-Out of Context cs.CV · 2025-05-31 · unverdicted · none · ref 34
COinCO is a new dataset of inpainted COCO images with in- and out-of-context objects, enabling context reasoning, object prediction from scenes, and improved fake image detection.
Wan: Open and Advanced Large-Scale Video Generative Models cs.CV · 2025-03-26 · unverdicted · none · ref 7
Wan releases open 1.3B and 14B video diffusion models claiming superior performance over open-source and commercial baselines across multiple tasks with consumer-grade efficiency.

Anydoor: Zero-shot object-level im- age customization

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer