Qwen-image technical report

Chenfei Wu, Jiahao Li, Jingren Zhou, Junyang Lin, Kaiyuan Gao, Kun Yan, Sheng ming Yin, Shuai Bai, Xiao Xu, Yilei Chen, Yuxiang Chen, Zecheng Tang, Zekai Zhang, Zhengyi Wang, An Ya

8 Pith papers cite this work. Polarity classification is still indexing.

8 Pith papers citing it

browse 8 citing papers

citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

VOSR: A Vision-Only Generative Model for Image Super-Resolution

cs.CV · 2026-04-03 · conditional · novelty 7.0

VOSR shows that competitive generative image super-resolution with faithful structures can be achieved by training a diffusion-style model from scratch on visual data alone, using a vision encoder for guidance and a restoration-oriented sampling strategy.

ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control

cs.CV · 2026-03-15 · unverdicted · novelty 7.0

ChArtist generates pictorial charts via a Diffusion Transformer using skeleton-based spatial control and reference-image subject control, supported by a new 30,000-triplet dataset and data accuracy metric.

DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching

cs.CV · 2026-02-05 · unverdicted · novelty 7.0

DisCa replaces heuristic feature caching with a lightweight learnable neural predictor compatible with distillation, achieving 11.8× acceleration on video diffusion transformers with preserved generation quality.

Do-Undo Bench: Reversibility for Action Understanding in Image Generation

cs.CV · 2025-12-15 · unverdicted · novelty 7.0

Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.

MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition

cs.CV · 2025-12-08 · unverdicted · novelty 7.0

MICo-150K is a new 150K-image dataset with 7 tasks, a De&Re real-image subset, MICo-Bench, and Weighted-Ref-VIEScore metric that improves AI models for generating consistent composites from arbitrary numbers of reference images.

HorizonWeaver: Generalizable Multi-Level Semantic Editing for Driving Scenes

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

HorizonWeaver enables photorealistic, instruction-driven multi-level editing of complex driving scenes with improved generalization via a new paired dataset, language-guided masks, and joint training losses.

Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes

cs.CV · 2025-12-15 · unverdicted · novelty 6.0

GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

cs.CV · 2025-12-14 · conditional · novelty 6.0

Scone unifies subject understanding and generation in a two-stage trained model to improve both composition and distinction in multi-subject image generation, outperforming prior open-source models on new benchmarks.

citing papers explorer

Showing 8 of 8 citing papers.

VOSR: A Vision-Only Generative Model for Image Super-Resolution cs.CV · 2026-04-03 · conditional · none · ref 43
VOSR shows that competitive generative image super-resolution with faithful structures can be achieved by training a diffusion-style model from scratch on visual data alone, using a vision encoder for guidance and a restoration-oriented sampling strategy.
ChArtist: Generating Pictorial Charts with Unified Spatial and Subject Control cs.CV · 2026-03-15 · unverdicted · none · ref 47
ChArtist generates pictorial charts via a Diffusion Transformer using skeleton-based spatial control and reference-image subject control, supported by a new 30,000-triplet dataset and data accuracy metric.
DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching cs.CV · 2026-02-05 · unverdicted · none · ref 67
DisCa replaces heuristic feature caching with a lightweight learnable neural predictor compatible with distillation, achieving 11.8× acceleration on video diffusion transformers with preserved generation quality.
Do-Undo Bench: Reversibility for Action Understanding in Image Generation cs.CV · 2025-12-15 · unverdicted · none · ref 33
Do-Undo Bench is a new evaluation task and dataset that forces models to simulate forward action effects and then undo them to measure genuine action understanding in image generation.
MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition cs.CV · 2025-12-08 · unverdicted · none · ref 85
MICo-150K is a new 150K-image dataset with 7 tasks, a De&Re real-image subset, MICo-Bench, and Weighted-Ref-VIEScore metric that improves AI models for generating consistent composites from arbitrary numbers of reference images.
HorizonWeaver: Generalizable Multi-Level Semantic Editing for Driving Scenes cs.CV · 2026-04-06 · unverdicted · none · ref 60
HorizonWeaver enables photorealistic, instruction-driven multi-level editing of complex driving scenes with improved generalization via a new paired dataset, language-guided masks, and joint training losses.
Scaling Up AI-Generated Image Detection with Generator-Aware Prototypes cs.CV · 2025-12-15 · unverdicted · none · ref 69
GAPL learns a compact set of canonical forgery prototypes and applies two-stage LoRA training to build a low-variance feature space that improves generalization across GAN and diffusion generators.
Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling cs.CV · 2025-12-14 · conditional · none · ref 36
Scone unifies subject understanding and generation in a two-stage trained model to improve both composition and distinction in multi-subject image generation, outperforming prior open-source models on new benchmarks.

Qwen-image technical report

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer