COMPASS: Complete Multimodal Fusion via Proxy Tokens and Shared Spaces for Ubiquitous Sensing
Missing modalities in multimodal sensing cause not only information loss but also a fusion-interface mismatch: a fusion head trained on a canonical set of modality slots must operate on changing observed subsets at inference time. We propose Compass, an interface-complete fusion framework that restores this canonical slot structure before prediction. Each modality is assigned a fixed fusion slot. Observed modalities populate their slots with real representations, while absent modalities are filled with target-slot completion representations estimated from the observed sources. Multiple source-specific estimates for the same missing slot are aggregated into a single slot filler, allowing the same lightweight fusion operator to be applied under arbitrary missing-modality patterns. Training uses synthetic modality masking, slot-compatibility supervision, and representation-space stabilization to make completed slots compatible with real modality representations and useful for downstream recognition. Across XRF55, MM-Fi, and OctoNet, Compass improves robustness under diverse single- and multiple-missing settings, including controlled comparisons against imputation, distillation, and translation-style baselines. These results suggest that preserving the fusion interface is a simple and effective principle for robust multimodal sensing.
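The slot-completion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the modality names, the embedding width, the linear source-to-target maps, mean aggregation, and the concatenate-then-project fusion head are all assumptions chosen for brevity. The abstract only says that source-specific estimates are aggregated into a single slot filler and that the same lightweight fusion operator is applied to every missing-modality pattern.

```python
import numpy as np

# Hypothetical canonical slot order and embedding width (not from the paper).
MODALITIES = ["wifi", "mmwave", "rfid"]
DIM = 4

rng = np.random.default_rng(0)

# Assumed completion estimators: one linear map per ordered (source, target) pair.
# In a trained system these would be learned; here they are random placeholders.
completers = {
    (s, t): rng.standard_normal((DIM, DIM)) * 0.1
    for s in MODALITIES
    for t in MODALITIES
    if s != t
}


def complete_slots(observed):
    """Fill every canonical slot, returning an array of shape (num_slots, DIM).

    `observed` maps modality name -> DIM-vector for the modalities present.
    Observed slots keep their real representation; each missing slot gets the
    mean of the estimates produced from every observed source (mean
    aggregation is an assumption of this sketch).
    """
    if not observed:
        raise ValueError("at least one modality must be observed")
    slots = []
    for target in MODALITIES:
        if target in observed:
            slots.append(np.asarray(observed[target], dtype=float))
        else:
            estimates = [
                np.asarray(observed[s], dtype=float) @ completers[(s, target)]
                for s in MODALITIES
                if s in observed
            ]
            slots.append(np.mean(estimates, axis=0))
    return np.stack(slots)


def fuse(slots, head):
    """Apply one fixed fusion head to the completed, canonical slot layout."""
    return slots.reshape(-1) @ head


# Usage: "mmwave" is missing, yet the fusion head sees the same input shape.
observed = {"wifi": np.ones(DIM), "rfid": np.full(DIM, 2.0)}
slots = complete_slots(observed)
head = rng.standard_normal((len(MODALITIES) * DIM, 2))  # hypothetical 2-class head
logits = fuse(slots, head)
```

Because `complete_slots` always returns one vector per canonical slot, `fuse` never needs to know which modalities were actually observed. That is the interface-preservation property the abstract describes.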
Forward citations
Cited by 1 Pith paper
- Diverse via bounded Agreement: Geometric Regularization for Multimodal Fusion
A regularization method enforces diverse intra-modal embeddings and bounded inter-modal drift to improve both multimodal fusion and unimodal robustness.