Auto-encoding variational bayes

Diederik P Kingma, Max Welling, et al · 2013

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

browse 9 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Score-based Membership Inference on Diffusion Models

cs.LG · 2025-09-29 · unverdicted · novelty 7.0

Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm to reveal training set proximity and outperforms multi-query baselines on eight datasets.

AudioMoG: Guiding Audio Generation with Mixture-of-Guidance

cs.SD · 2025-09-28 · unverdicted · novelty 7.0

AudioMoG is a mixture-of-guidance sampling technique that combines CFG and AG signals to outperform single-guidance baselines in text-to-audio generation at equivalent speed.

DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty

cs.LG · 2025-06-14 · unverdicted · novelty 7.0

DR-SAC is the first actor-critic distributionally robust RL algorithm for offline continuous control that derives a convergent robust soft policy iteration and reports up to 9.8x higher rewards than SAC under perturbations.

Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling

cs.LG · 2025-08-07 · unverdicted · novelty 6.0

NegBio-VAE introduces negative binomial latents with dispersion to handle overdispersion in discrete VAE models, yielding better reconstruction, generation, and downstream representations than Poisson VAE baselines.

Matrix-game 2.0: An open-source real-time and streaming interactive world model

cs.CV · 2025-08-18 · unverdicted · novelty 5.0

Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.

Geometry-aware 4D Video Generation for Robot Manipulation

cs.CV · 2025-07-01 · unverdicted · novelty 5.0

A geometry-aware 4D video generation model trained with cross-view pointmap alignment to produce spatio-temporally consistent future videos from novel viewpoints for robot manipulation.

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

cs.CV · 2025-05-14 · conditional · novelty 5.0

BLIP3-o uses a diffusion transformer to generate CLIP image features and a sequential pretraining strategy to build open models that perform strongly on both image understanding and generation benchmarks.

Information Bottleneck-Guided Heterogeneous Graph Learning for Interpretable Neurodevelopmental Disorder Diagnosis

cs.CV · 2025-02-28 · unverdicted · novelty 5.0

Proposes the I2B-HGNN framework using information bottleneck-guided graph transformers and heterogeneous graph attention for interpretable multimodal NDD diagnosis.

VRAG: Learning World Models for Interactive Video Generation

cs.CV · 2025-05-28

citing papers explorer

Showing 9 of 9 citing papers.

Score-based Membership Inference on Diffusion Models cs.LG · 2025-09-29 · unverdicted · none · ref 21
Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm to reveal training set proximity and outperforms multi-query baselines on eight datasets.
AudioMoG: Guiding Audio Generation with Mixture-of-Guidance cs.SD · 2025-09-28 · unverdicted · none · ref 37
AudioMoG is a mixture-of-guidance sampling technique that combines CFG and AG signals to outperform single-guidance baselines in text-to-audio generation at equivalent speed.
DR-SAC: Distributionally Robust Soft Actor-Critic for Reinforcement Learning under Uncertainty cs.LG · 2025-06-14 · unverdicted · none · ref 19
DR-SAC is the first actor-critic distributionally robust RL algorithm for offline continuous control that derives a convergent robust soft policy iteration and reports up to 9.8x higher rewards than SAC under perturbations.
Negative Binomial Variational Autoencoders for Overdispersed Latent Modeling cs.LG · 2025-08-07 · unverdicted · none · ref 19
NegBio-VAE introduces negative binomial latents with dispersion to handle overdispersion in discrete VAE models, yielding better reconstruction, generation, and downstream representations than Poisson VAE baselines.
Matrix-game 2.0: An open-source real-time and streaming interactive world model cs.CV · 2025-08-18 · unverdicted · none · ref 20
Matrix-Game 2.0 introduces a scalable data pipeline, action-injection module, and few-step distillation to enable real-time streaming video generation at 25 FPS from game-engine interactions, with open-sourced weights and code.
Geometry-aware 4D Video Generation for Robot Manipulation cs.CV · 2025-07-01 · unverdicted · none · ref 35
A geometry-aware 4D video generation model trained with cross-view pointmap alignment to produce spatio-temporally consistent future videos from novel viewpoints for robot manipulation.
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset cs.CV · 2025-05-14 · conditional · none · ref 12
BLIP3-o uses a diffusion transformer to generate CLIP image features and a sequential pretraining strategy to build open models that perform strongly on both image understanding and generation benchmarks.
Information Bottleneck-Guided Heterogeneous Graph Learning for Interpretable Neurodevelopmental Disorder Diagnosis cs.CV · 2025-02-28 · unverdicted · none · ref 46
Proposes the I2B-HGNN framework using information bottleneck-guided graph transformers and heterogeneous graph attention for interpretable multimodal NDD diagnosis.
VRAG: Learning World Models for Interactive Video Generation cs.CV · 2025-05-28 · unreviewed · ref 30

Auto-encoding variational bayes

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer