Towards uniformity and alignment for multimodal representation learning. arXiv preprint arXiv:2602.09507

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.CV 1, cs.RO 1

years: 2026 2

representative citing papers

Showing 1 of 1 citing paper after filters.

  • DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation cs.RO · 2026-05-28 · unverdicted · none · ref 55

    DynaFLIP pre-trains dynamics-aware image encoders by aligning image, language, and 3D flow modalities through simplex-volume minimization plus regularizers on video triplets, yielding reusable backbones that improve manipulation policies by up to 22.5% in out-of-distribution settings.