Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models

· 2026 · cs.CV · arXiv 2604.10949

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open full Pith review browse 1 citing papers arXiv PDF

abstract

Unified multimodal models (UMMs) were designed to combine the reasoning ability of large language models (LLMs) with the generation capability of vision models. In practice, however, this synergy remains elusive: UMMs fail to transfer LLM-like reasoning to image synthesis and exhibit divergent response behaviors. We term this phenomenon pseudo-unification. Diagnosing its internal causes is important, but existing probing methods either lack model-internal insight or ignore prompt-response dependencies. To address these limitations, we propose an information-theoretic probing framework that jointly analyzes how UMMs encode inputs and generate outputs. Applied to ten representative UMMs, our framework reveals that pseudo-unification stems from a dual divergence: (i) Modality-Asymmetric Encoding, where vision and language follow different entropy trajectories, and (ii) Pattern-Split Response, where text generation exhibits high-entropy creativity while image synthesis enforces low-entropy fidelity. Only models that unify both sides (e.g., via contextual prediction) achieve more genuine unification, enabling stronger reasoning-based text-to-image generation even with fewer parameters. Our work provides the first model-internal probing of unification, demonstrating that real multimodal synergy requires consistency in information flow, not just shared parameters.

representative citing papers

LatentUMM: Dual Latent Alignment for Unified Multimodal Models

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

LatentUMM proposes dual latent alignment at modality and capacity levels plus latent dynamics stabilization to reduce semantic drift and improve consistency in unified multimodal models.

citing papers explorer

Showing 1 of 1 citing paper.

LatentUMM: Dual Latent Alignment for Unified Multimodal Models cs.CV · 2026-05-18 · unverdicted · none · ref 51 · internal anchor
LatentUMM proposes dual latent alignment at modality and capacity levels plus latent dynamics stabilization to reduce semantic drift and improve consistency in unified multimodal models.

Pseudo-Unification: Entropy Probing Reveals Divergent Information Patterns in Unified Multimodal Models

fields

years

verdicts

representative citing papers

citing papers explorer