Latent World Recovery for Multimodal Learning with Missing Modalities

Christopher Baker; Hui Wang; Joseph Butler; Karen Rafferty; Simon McDade; Tianyu Ren

arxiv: 2606.12362 · v1 · pith:FTRB37N3new · submitted 2026-06-10 · 💻 cs.LG · cs.AI

Hui Wang, Tianyu Ren, Joseph Butler, Christopher Baker, Karen Rafferty, Simon McDade

Pith reviewed 2026-06-27 10:28 UTC · model grok-4.3

keywords: multimodal learning · missing modalities · latent space alignment · multi-omics · cancer classification · survival prediction · availability-aware fusion

The pith

Multimodal prediction proceeds by aligning observed modality embeddings in a shared latent space and fusing only those available, without imputing the missing ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Latent World Recovery (LWR), an approach to multimodal learning when some modalities are absent. It aligns modality-specific embeddings in one latent space using neighbor information and builds a unified representation solely from the modalities present for each sample. Each observed modality is treated as a partial view of an underlying state, so neither a complete nor an imputed modality set is required. The method is evaluated on incomplete multi-omics data for cancer phenotype classification and survival prediction, where the authors report that fusing the available embeddings directly is effective for these downstream tasks.

Core claim

Latent World Recovery recovers a usable representation from partial modality sets by first aligning each modality's embeddings to a common latent space via neighbor-based matching and then performing availability-aware fusion on only the observed embeddings, thereby supporting robust prediction without explicit reconstruction of absent modalities.

What carries the argument

Neighbor-based latent alignment of modality embeddings combined with availability-aware fusion that operates exclusively on observed modalities treated as partial perceptions of a shared latent state.
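The abstract does not say which operator performs the fusion, so the following is only a minimal sketch of what availability-aware fusion could look like, assuming a masked mean over aligned embeddings. The function name `fuse_available`, the array layout, and the choice of a mean are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def fuse_available(embeddings, mask):
    """Average the embeddings of the observed modalities for each sample.

    embeddings: float array, shape (n_samples, n_modalities, dim)
    mask:       bool array, shape (n_samples, n_modalities); True = observed

    Hypothetical helper: a masked mean is one simple fusion choice,
    assumed here because the source does not specify the operator.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    counts = mask.sum(axis=1)
    if np.any(counts == 0):
        raise ValueError("every sample needs at least one observed modality")
    # Replace missing slots with zeros so placeholder values (even NaN)
    # never reach the fused representation.
    observed = np.where(mask[:, :, None], embeddings, 0.0)
    return observed.sum(axis=1) / counts[:, None]

# Two samples, two modalities, 2-dimensional embeddings.
emb = np.array([
    [[1.0, 1.0], [3.0, 3.0]],          # both modalities observed
    [[2.0, 4.0], [np.nan, np.nan]],    # second modality missing
])
mask = np.array([[True, True], [True, False]])
fused = fuse_available(emb, mask)
```

Because the mask is an input rather than part of the model, the same call covers any non-empty subset of modalities, and nothing is generated for the missing slot.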

If this is right

Training and inference become possible with any non-empty subset of modalities rather than a fixed complete set.
Error accumulation from generating synthetic values for missing modalities is avoided.
The same learned latent space supports multiple downstream tasks such as classification and survival analysis on real incomplete multi-omics collections.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The alignment strategy could extend to other sensor or data streams where individual channels drop out unpredictably.
If neighbor matching proves stable, the method may reduce the need for modality-specific generative models in partially observed settings.
Dynamic missingness patterns during deployment could be handled by re-using the same availability-aware fusion step without retraining.

Load-bearing premise

Embeddings produced by different modalities can be aligned into one consistent latent space even when some modalities are missing for many samples.
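One way to probe this premise, sketched here under assumptions of our own, is to take the samples for which two modalities are both observed and ask whether each sample has similar nearest neighbors in the two embedding spaces. The helper names `knn_indices` and `neighbor_agreement`, the Euclidean metric, and the overlap score are illustrative choices; the paper's actual alignment objective is not visible in the abstract.

```python
import numpy as np

def knn_indices(x, k):
    """Indices of the k nearest neighbors (Euclidean) of each row, excluding itself."""
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def neighbor_agreement(z_a, z_b, k=5):
    """Mean fraction of shared k-nearest neighbors between two embedding sets.

    z_a, z_b: arrays of shape (n_paired, dim_a) and (n_paired, dim_b) holding
    embeddings of the SAME samples in the same row order. Hypothetical
    diagnostic: 1.0 means identical neighborhoods, values near 0 mean the
    two modalities disagree about which samples are similar.
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    n = z_a.shape[0]
    if z_b.shape[0] != n:
        raise ValueError("both inputs must describe the same paired samples")
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n_paired")
    nn_a = knn_indices(z_a, k)
    nn_b = knn_indices(z_b, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlap))

rng = np.random.default_rng(0)
z = rng.normal(size=(20, 4))
same_space = neighbor_agreement(z, z, k=3)
unrelated = neighbor_agreement(z, rng.normal(size=(20, 6)), k=3)
```

The score can only be computed on samples where both modalities are present, which is exactly where the premise is most exposed: the fewer paired samples exist, the less evidence there is that the shared space is consistent.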

What would settle it

The claim of an advantage without reconstruction would fail if, on the same incomplete multi-omics benchmarks, an imputation-based baseline or a complete-case baseline matched or exceeded LWR's accuracy.
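To make the two baselines concrete, the sketch below prepares inputs for each from a feature array and an availability mask. It assumes per-modality mean imputation as the simplest stand-in for an imputation baseline; the function names and data layout are hypothetical, and a real comparison would compute the means on the training split only.

```python
import numpy as np

def complete_case_rows(mask):
    """Indices of samples for which every modality is observed."""
    return np.flatnonzero(np.asarray(mask, dtype=bool).all(axis=1))

def mean_impute(features, mask):
    """Fill each missing modality with that modality's mean over observed samples.

    features: (n_samples, n_modalities, dim); missing slots may hold NaN
    mask:     (n_samples, n_modalities); True = observed
    Assumption: mean imputation stands in for whatever imputation
    baseline the full paper uses, which the abstract does not name.
    """
    mask = np.asarray(mask, dtype=bool)
    filled = np.array(features, dtype=float, copy=True)
    for m in range(mask.shape[1]):
        observed = mask[:, m]
        if not observed.any():
            raise ValueError(f"modality {m} is never observed")
        filled[~observed, m, :] = filled[observed, m, :].mean(axis=0)
    return filled

features = np.array([
    [[1.0, 1.0], [10.0, 10.0]],
    [[3.0, 3.0], [np.nan, np.nan]],
    [[np.nan, np.nan], [20.0, 20.0]],
])
mask = np.array([[True, True], [True, False], [False, True]])

kept = complete_case_rows(mask)        # only sample 0 has every modality
imputed = mean_impute(features, mask)  # synthetic values fill the gaps
```

In this toy example the complete-case baseline keeps one sample out of three, while the imputation baseline keeps all three at the cost of introducing synthetic values, which is the trade-off the comparison is meant to test.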

Figures

Figures reproduced from arXiv: 2606.12362 by Christopher Baker, Hui Wang, Joseph Butler, Karen Rafferty, Simon McDade, Tianyu Ren.

**Figure 2.** Case study of clustering-based survival stratification on …

**Figure 3.** Biological and clinical characterization of LWR survival-associated clusters

Original abstract

We study multimodal learning under missing modalities, with particular motivation from bioscience applications in which heterogeneous modalities are often only partially available when decisions need to be made. We propose Latent World Recovery (LWR), a framework built on two key ideas: (i) modality-specific embeddings from different modalities are aligned in a shared latent space, and (ii) a unified representation is constructed by fusing only the embeddings of the modalities that are actually available at both training and inference time. Rather than imputing missing modalities or requiring a fixed modality set, LWR treats each modality as a partial perception of an underlying latent state and performs availability-aware representation learning directly from the observed modalities. This combination of neighbor-based latent alignment and availability-aware modality fusion enables robust multimodal prediction under partial observation, while avoiding error propagation from explicit reconstruction of missing modalities. We evaluate the proposed framework on real-world incomplete multi-omics benchmarks and demonstrate that it provides an effective approach to downstream tasks such as cancer phenotype classification and survival prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LWR gives a clean way to skip imputation by aligning available modality embeddings in latent space and fusing on the fly, but the abstract leaves the actual gains and implementation details unclear.

The core idea here is neighbor-based latent alignment plus availability-aware fusion so the model never tries to reconstruct missing modalities. That design choice is sensible for bioscience multi-omics where modalities drop out unpredictably.

What the paper actually does is treat each modality as a partial view of a shared latent state and build the representation only from observed embeddings at both train and test time. The evaluation uses real incomplete multi-omics benchmarks for cancer phenotype classification and survival prediction, which matches the stated motivation.

The main limitation is that the abstract supplies no numbers, no baseline comparisons, and no ablation on the alignment step, so it is impossible to tell whether the method moves the needle over existing partial-observation techniques or simply re-packages them. The claim that this avoids error propagation is presented as a property of the design rather than something measured.

If the full experiments show consistent gains on the benchmarks without extra hyperparameters or heavy tuning, the work is worth a look for groups that routinely deal with patchy omics data. Otherwise it risks being another latent-space wrapper that works when modalities are already reasonably aligned.

I would bring this to a reading group to see the actual results and code. It is not a paper I would cite on its own unless the gains are large and reproducible. A serious editor should send it out for review; the problem is real and the framing is straightforward, even if the evidence bar is still low based on what is visible so far.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes Latent World Recovery (LWR), a framework for multimodal learning with missing modalities motivated by bioscience applications. It aligns modality-specific embeddings in a shared latent space via neighbor-based alignment and constructs unified representations by fusing only the embeddings of modalities available at training and inference time. Modalities are treated as partial observations of an underlying latent state, avoiding explicit imputation or fixed modality sets. The approach is claimed to enable robust prediction under partial observation without error propagation from reconstruction, and is evaluated on incomplete multi-omics benchmarks for cancer phenotype classification and survival prediction.

Significance. If supported by rigorous derivations and experiments, the framework could offer a practical alternative to imputation-based multimodal methods in domains with frequent missing data. The design choice to fuse only observed modalities directly is a reasonable way to sidestep reconstruction errors, and the neighbor-based alignment may provide a scalable way to achieve the shared latent space. However, the significance cannot be determined from the provided text alone, as no derivations, ablations, or quantitative results are visible.

major comments (1)

The manuscript consists solely of an abstract that describes the method and claims effectiveness on benchmarks but provides no mathematical derivations, experimental details, results, tables, or comparisons. This absence makes it impossible to verify whether the central claims (robust prediction under partial observation, avoidance of error propagation) are supported by the math or data.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their review. We address the single major comment below.

Referee: The manuscript consists solely of an abstract that describes the method and claims effectiveness on benchmarks but provides no mathematical derivations, experimental details, results, tables, or comparisons. This absence makes it impossible to verify whether the central claims (robust prediction under partial observation, avoidance of error propagation) are supported by the math or data.

Authors: We agree that the version under review contains only the abstract and lacks the mathematical derivations for the neighbor-based alignment and availability-aware fusion, the experimental protocols, quantitative results, tables, and baseline comparisons. This omission prevents verification of the claims. We will revise the manuscript to include the full technical content, derivations, and benchmark evaluations on the incomplete multi-omics datasets for cancer phenotype classification and survival prediction.

revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The paper proposes the LWR framework as a design choice consisting of neighbor-based latent alignment of modality embeddings into a shared space plus availability-aware fusion of only observed modalities. No equations, derivations, fitted parameters, or predictions are described in the abstract or claims that reduce by construction to the inputs. The method is presented as an empirical approach evaluated on external benchmarks rather than a self-referential theorem or renamed known result. The central claims remain independent of any self-citation chain or definitional loop.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework relies on the assumption of an alignable latent space and the effectiveness of direct fusion from observed modalities.

axioms (2)

domain assumption: Modality-specific embeddings can be aligned in a shared latent space using neighbor-based methods.
This is stated as key idea (i) in the abstract.
domain assumption: Fusing only available modality embeddings enables robust prediction without error propagation from imputation.
This is the core of the availability-aware fusion idea.

invented entities (1)

Latent World (no independent evidence)
purpose: Underlying latent state that each modality partially perceives.
Introduced as the conceptual basis for the framework.
