GLIGEN: Open-Set Grounded Text-to-Image Generation
4 Pith papers cite this work.
citing papers explorer
- ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control
  ChArtist generates pictorial charts via a Diffusion Transformer using skeleton-based spatial control and reference-image subject control, supported by a new 30,000-triplet dataset and data accuracy metric.
- Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning
  Realiz3D decouples visual domain from 3D controls in diffusion models via domain-aware residual adapters to enable photorealistic controllable generation.
- Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation
  Fine-tuning text-to-video models on sparse low-quality synthetic data for physical camera controls outperforms fine-tuning on photorealistic data.
- MedShift: Implicit Conditional Transport for X-Ray Domain Adaptation
  MedShift applies flow matching and Schrödinger bridges for class-conditional unpaired translation between synthetic and real skull X-rays, benchmarked on the new X-DigiSkull dataset.