In: Proceedings of the IEEE/CVF International Conference on Computer Vision

Zhu, H · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models

cs.CV · 2026-05-03 · unverdicted · novelty 7.0

M²-REPA decouples modality-specific features from diffusion intermediates and aligns them to complementary expert foundation models via a multi-modal alignment loss and modality-specific decoupling regularization for improved multimodal video generation.

citing papers explorer

Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models

cs.CV · 2026-05-03 · unverdicted · none · ref 65
