Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation
Mogao is a causal unified model combining deep fusion, dual vision encoders, and interleaved position embeddings. It achieves strong performance on multi-modal understanding and text-to-image generation, and produces coherent interleaved outputs, including zero-shot image editing.