InfoTok uses mutual information constraints to regularize shared visual tokenization in unified MLLMs, improving both understanding and generation performance without extra training data.
Diffusion-4k: Ultra- high-resolution image synthesis with latent diffusion models,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenization in Unified MLLMs
InfoTok uses mutual information constraints to regularize shared visual tokenization in unified MLLMs, improving both understanding and generation performance without extra training data.