Joint Multimodal Learning with Deep Generative Models

Suzuki, M · 2016 · stat.ML · arXiv 1611.01891

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

We investigate deep generative models that can exchange multiple modalities bi-directionally, e.g., generating images from corresponding texts and vice versa. Recently, some studies handle multiple modalities on deep generative models, such as variational autoencoders (VAEs). However, these models typically assume that modalities are forced to have a conditioned relation, i.e., we can only generate modalities in one direction. To achieve our objective, we should extract a joint representation that captures high-level concepts among all modalities and through which we can exchange them bi-directionally. As described herein, we propose a joint multimodal variational autoencoder (JMVAE), in which all modalities are independently conditioned on joint representation. In other words, it models a joint distribution of modalities. Furthermore, to be able to generate missing modalities from the remaining modalities properly, we develop an additional method, JMVAE-kl, that is trained by reducing the divergence between JMVAE's encoder and prepared networks of respective modalities. Our experiments show that our proposed method can obtain appropriate joint representation from multiple modalities and that it can generate and reconstruct them more properly than conventional VAEs. We further demonstrate that JMVAE can generate multiple modalities bi-directionally.
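
As a rough illustration of the JMVAE-kl idea in the abstract (pulling the joint encoder towards separate single-modality encoders by reducing a divergence), the sketch below computes a KL penalty between diagonal Gaussian posteriors. It is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian parameterization by mean and log-variance, the direction of the divergence (joint encoder first), the weight `alpha`, and the function names `kl_diag_gaussians` and `jmvae_kl_penalty` are all illustrative choices that the abstract does not specify.

```python
import math


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) for two diagonal Gaussians.

    Each argument is a list of per-dimension means or log-variances.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def jmvae_kl_penalty(joint, enc_x, enc_w, alpha=0.1):
    """Hypothetical penalty term: alpha * [KL(joint || enc_x) + KL(joint || enc_w)].

    joint, enc_x, enc_w are (mu, logvar) pairs describing the joint encoder
    and the two single-modality encoders. alpha is an assumed weight.
    """
    kl_x = kl_diag_gaussians(joint[0], joint[1], enc_x[0], enc_x[1])
    kl_w = kl_diag_gaussians(joint[0], joint[1], enc_w[0], enc_w[1])
    return alpha * (kl_x + kl_w)


# Toy two-dimensional latent: the image-only encoder is off by one unit in
# the first dimension, the text-only encoder matches the joint encoder.
joint = ([0.0, 0.0], [0.0, 0.0])
enc_x = ([1.0, 0.0], [0.0, 0.0])
enc_w = ([0.0, 0.0], [0.0, 0.0])
penalty = jmvae_kl_penalty(joint, enc_x, enc_w, alpha=0.1)
```

In a training loop this penalty would be subtracted from the joint model's variational lower bound, so that a single-modality encoder can stand in for the joint encoder when the other modality is missing.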

representative citing papers

Multi-Component VAE with Gaussian Markov Random Field

cs.LG · 2025-07-16 · unverdicted · novelty 6.0

GMRF MCVAE embeds Gaussian Markov Random Fields into VAE prior and posterior distributions to explicitly model cross-component relationships, reporting SOTA results on a synthetic Copula dataset and improved coherence on BIKED.

MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple Myeloma

cs.LG · 2026-04-07 · unverdicted · novelty 5.0

Moderate relaxation of KL regularization and hybrid continuous-discrete latent spaces improve survival discrimination in multi-omics VAEs for multiple myeloma.

citing papers explorer

Showing 2 of 2 citing papers.

Multi-Component VAE with Gaussian Markov Random Field · cs.LG · 2025-07-16 · unverdicted · none · ref 12 · internal anchor
MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple Myeloma · cs.LG · 2026-04-07 · unverdicted · none · ref 9