{"paper":{"title":"Diverse via bounded Agreement: Geometric Regularization for Multimodal Fusion","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Regularizing multimodal representation geometry mitigates modality trade-offs.","cross_cats":["cs.LG"],"primary_cat":"cs.CV","authors_text":"Fei Wang, Hao Wang, Pengcheng Weng, William Dan, Yangxin Xu, Yanyu Qian, Zixuan Xia","submitted_at":"2026-01-29T13:03:50Z","abstract_excerpt":"Multimodal fusion is often treated as an optimization-balancing problem, where training signals are adjusted to prevent one modality from dominating the others. However, balanced optimization does not fully determine the geometry of intermediate representations. Supervised multimodal models may still learn low-diversity modality-specific embeddings or allow paired cross-modal observations to drift excessively apart, weakening both unimodal robustness and multimodal fusion.\n  We introduce \\regName, a lightweight plug-and-play geometric regularization framework for multimodal representation lear"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"explicitly regulating representation geometry effectively mitigates modality trade-offs","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the observed intra-modal collapse and cross-modal inconsistency are the primary geometric pathologies limiting performance and that the proposed regularizers can be added without introducing new optimization instabilities or requiring extensive retuning.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"A regularization method enforces diverse intra-modal embeddings and bounded inter-modal drift to improve both multimodal fusion and unimodal robustness.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Regularizing multimodal representation geometry mitigates modality trade-offs.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"673110942e1c21863fc4e66bb126781e8e7f05f2a7e4a748fb35e58e1fca10ec"},"source":{"id":"2601.21670","kind":"arxiv","version":3},"verdict":{"id":"5d0b2548-02de-45df-b0b7-a93d23768cfa","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T09:57:19.642562Z","strongest_claim":"explicitly regulating representation geometry effectively mitigates modality trade-offs","one_line_summary":"A regularization method enforces diverse intra-modal embeddings and bounded inter-modal drift to improve both multimodal fusion and unimodal robustness.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the observed intra-modal collapse and cross-modal inconsistency are the primary geometric pathologies limiting performance and that the proposed regularizers can be added without introducing new optimization instabilities or requiring extensive retuning.","pith_extraction_headline":"Regularizing multimodal representation geometry mitigates modality trade-offs."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2601.21670/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"0dd4ea60462a5f704ff14e371b828e5943e9a635311902ef220e19adff913bab"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}