ParetoSlider conditions diffusion models on continuous preference weights to approximate the full Pareto front, providing dynamic control over multi-objective rewards at inference time.
hub
FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space
12 Pith papers cite this work.
representative citing papers
ChArtist generates pictorial charts via a Diffusion Transformer using skeleton-based spatial control and reference-image subject control, supported by a new 30,000-triplet dataset and data accuracy metric.
LooseRoPE modulates RoPE in diffusion attention maps to continuously trade off between preserving a pasted object's identity and harmonizing it with its new surroundings.
Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.
MICo-150K is a new 150K-image dataset covering 7 tasks, accompanied by a De&Re real-image subset, MICo-Bench, and a Weighted-Ref-VIEScore metric; it improves models that generate consistent composites from an arbitrary number of reference images.
The paper introduces a framework of four complementary analyses to evaluate the faithfulness of synthetic concept images from zero-shot T2I models versus real images for concept-based XAI.
Stepper uses stepwise panoramic expansion with a multi-view 360-degree diffusion model and geometry reconstruction to produce high-fidelity, structurally consistent immersive 3D scenes from text.
RenderFlow replaces iterative diffusion with flow matching for deterministic single-step neural rendering that achieves near real-time photorealistic quality and extends to inverse rendering via an adapter module.
The method warps pixels inside object boundaries with Snell's Law during generation and synchronizes with a second panorama image to produce optically plausible refraction in text-to-image outputs.
SkyReels-Text enables simultaneous fine-grained editing of multiple text regions in posters using arbitrary glyph patches for font control without labels or test-time fine-tuning.
GrOCE uses dynamic semantic graphs for online, training-free erasure of target concepts from diffusion model prompts via cluster identification and selective severing.