Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

fields

cs.CV 13 cs.LG 1

years

2026 14

roles

background 2

polarities

background 1 support 1

representative citing papers

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

cs.CV · 2026-05-20 · unverdicted · novelty 5.0

Lens is a 3.8B-parameter text-to-image model that reaches competitive or superior performance to >6B-parameter systems using 19.3% of the training compute of Z-Image through a densely captioned 800M dataset, multi-resolution batching, semantic VAE, strong language encoder, RL fine-tuning, and 4-step

Qwen-Image-Flash: Beyond Objective Design

cs.CV · 2026-06-02 · unverdicted · novelty 4.0

Empirical analysis of data, guidance, and task mixture in few-step distillation of Qwen-Image-2.0 produces the Qwen-Image-Flash model with improved performance in unified generation and editing tasks.

ERNIE-Image Technical Report

cs.CV · 2026-05-25 · unverdicted · novelty 4.0 · 2 refs

The paper presents ERNIE-Image, an open-source 8B DiT text-to-image model claiming leading open-source performance and near-commercial results via specialized data construction and DPO alignment.

Qwen-Image-2.0 Technical Report

cs.CV · 2026-05-11 · unverdicted · novelty 4.0

Qwen-Image-2.0 unifies high-fidelity image generation and precise editing by coupling Qwen3-VL with a Multimodal Diffusion Transformer, improving text rendering, photorealism, and complex prompt following over prior versions.
