Show-o: One single transformer to unify multimodal understanding and generation, 2025

Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

FullFlow adds LoRA adapters and discrete text insertion to pretrained rectified-flow text-to-image models, achieving bidirectional generation with major gains in FID, CIDEr, VRAM, and throughput over Dual Diffusion baselines.

citing papers explorer

Showing 1 of 1 citing paper.

FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation · cs.CV · 2026-05-19 · unverdicted · none · ref 54
