pith. sign in

Liquid: Language models are scalable multi-modal generators

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

citation-role summary

baseline 2 background 1

citation-polarity summary

fields

cs.CV 6

years

2026 2 2025 4

verdicts

UNVERDICTED 6

representative citing papers

Mogao: An Omni Foundation Model for Interleaved Multi-Modal Generation

cs.CV · 2025-05-08 · unverdicted · novelty 6.0

Mogao presents a causal unified model with deep fusion, dual encoders, and interleaved position embeddings that achieves strong performance on multi-modal understanding, text-to-image generation, and coherent interleaved outputs including zero-shot editing.

Emerging Properties in Unified Multimodal Pretraining

cs.CV · 2025-05-20 · unverdicted · novelty 5.0

BAGEL is a unified decoder-only model that develops emerging complex multimodal reasoning abilities after pretraining on large-scale interleaved data and outperforms prior open-source unified models.

Show-o2: Improved Native Unified Multimodal Models

cs.CV · 2025-06-18 · unverdicted · novelty 4.0

Show-o2 unifies text, image, and video understanding and generation in a single autoregressive-plus-flow-matching model built on 3D causal VAE representations.

citing papers explorer

Showing 6 of 6 citing papers.