Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- WorldComp2D: Spatio-semantic Representations of Object Identity and Location from Local Views
WorldComp2D explicitly structures the geometry of its latent space by object identity and spatial proximity, using a proximity-dependent encoder and a localizer. On facial landmark localization it cuts parameters by up to 4× and FLOPs by 2.2× relative to state-of-the-art lightweight models while staying real-time on CPU.
- Learning Invariant Modality Representation for Robust Multimodal Learning from a Causal Inference Perspective
CmIR uses causal inference to separate invariant causal representations from spurious ones in multimodal data, improving generalization under distribution shifts and noise via invariance, mutual information, and reconstruction constraints.
- The Platonic Representation Hypothesis
Representations learned by large AI models are converging toward a shared statistical model of reality.