Flux Matching generalizes score-based generative modeling with a weaker objective that admits infinitely many non-conservative vector fields whose stationary distribution is the data distribution, opening up design choices beyond traditional score matching.
Density estimation using Real NVP
Mixed citation behavior. Most common role is background (62%).
abstract
Unsupervised learning of probabilistic models is a central yet challenging problem in machine learning. Specifically, designing models with tractable learning, sampling, inference and evaluation is crucial in solving this task. We extend the space of such models using real-valued non-volume preserving (real NVP) transformations, a set of powerful invertible and learnable transformations, resulting in an unsupervised learning algorithm with exact log-likelihood computation, exact sampling, exact inference of latent variables, and an interpretable latent space. We demonstrate its ability to model natural images on four datasets through sampling, log-likelihood evaluation and latent variable manipulations.
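The abstract's claims of exact log-likelihood, exact sampling, and exact latent inference can be illustrated with a single affine coupling layer, the building block commonly associated with real NVP. The following is a minimal NumPy sketch, not the authors' implementation: the helper names (`make_coupling`, `forward`, `inverse`, `log_likelihood`) and the tiny fixed-weight scale and translation maps are assumptions chosen for illustration, and a standard normal prior on the latent variable is assumed.

```python
import numpy as np

def make_coupling(dim, seed=0):
    """Build hypothetical scale s(.) and translation t(.) maps for one layer.

    A real model would use trained neural networks; fixed random linear
    maps are used here only to keep the sketch self-contained.
    """
    rng = np.random.default_rng(seed)
    half = dim // 2
    Ws = rng.normal(scale=0.5, size=(half, dim - half))
    Wt = rng.normal(scale=0.5, size=(half, dim - half))
    s = lambda x1: np.tanh(x1 @ Ws)   # bounded log-scale
    t = lambda x1: x1 @ Wt            # translation
    return s, t

def forward(x, s, t):
    """Map data x to latent y; also return log|det Jacobian|."""
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    scale = s(x1)
    y2 = x2 * np.exp(scale) + t(x1)   # only the second half is transformed
    log_det = scale.sum(axis=-1)      # Jacobian is triangular
    return np.concatenate([x1, y2], axis=-1), log_det

def inverse(y, s, t):
    """Exact inverse of forward; used for sampling."""
    half = y.shape[-1] // 2
    y1, y2 = y[..., :half], y[..., half:]
    x2 = (y2 - t(y1)) * np.exp(-s(y1))
    return np.concatenate([y1, x2], axis=-1)

def log_likelihood(x, s, t):
    """Exact log p(x) under an assumed standard normal prior on the latent."""
    z, log_det = forward(x, s, t)
    d = z.shape[-1]
    log_pz = -0.5 * (z ** 2).sum(axis=-1) - 0.5 * d * np.log(2 * np.pi)
    return log_pz + log_det

if __name__ == "__main__":
    s, t = make_coupling(4)
    x = np.random.default_rng(1).normal(size=(3, 4))
    y, _ = forward(x, s, t)
    print(np.allclose(inverse(y, s, t), x))
    print(log_likelihood(x, s, t))
```

Because the first half of the input passes through unchanged, the Jacobian is triangular and its log-determinant reduces to the sum of the scale outputs, which is what makes the likelihood cheap to evaluate exactly. Stacking several such layers with alternating halves lets every dimension be transformed.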
representative citing papers
Rectified flow learns straight-path neural ODEs for distribution transport, yielding efficient generative models and domain transfers that work well even with a single simulation step.
Introduces an SDE-based framework for score-based generative modeling that unifies prior methods, enables predictor-corrector sampling and neural ODE likelihoods, and achieves SOTA unconditional image generation on CIFAR-10.
DDIMs construct non-Markovian diffusion processes that share DDPM training objectives but allow much faster reverse sampling, demonstrated empirically at 10-50x wall-clock speedup.
Denoising diffusion probabilistic models generate high-quality images by learning to reverse a fixed forward diffusion process, achieving an FID of 3.17 on CIFAR-10.
A policy network learns to choose unmasking order in masked diffusion by reweighting the loss, outperforming random and heuristic baselines on ordering-sensitive tasks.
Presents a controlled vector field framework for continuous generative modeling where velocity is formed from fixed bracket-generating fields modulated by scalar controls, with an expressivity principle under controllability assumptions.
A coupling-flow global proposal for Monte Carlo sampling in 2D pure SU(2) lattice gauge theory is shown to be formally valid and to reproduce the target ensemble in proof-of-principle tests, with modest hybrid gains but no clear outperformance over local baselines.
A flow matching generative model produces weak lensing mass maps whose fidelity improves on GAN benchmarks, with discrepancies below 1% on basic statistics and below 5% on higher-order statistics.
DriftXpress approximates drifting kernels via projected RKHS fields to lower training cost of one-step generative models while matching original FID scores.
NTM models each generative reverse step as a conditional normalizing flow with a hybrid shallow-deep architecture, enabling exact-likelihood training and strong four-step sampling performance on text-to-image tasks.
Neural scaling laws are invariant under bijective data transformations and change predictably with information resolution ρ under non-bijective transformations, enabling cross-domain transport of fitted exponents.
TRACE creates valid conformal prediction sets for complex generative models by scoring outputs via averaged denoising or velocity errors along stochastic transport paths instead of likelihoods.
A nonparametric pixel-based Bayesian method integrates TMD evolution with generative AI sampling and SVD to extract parton distributions and identify unconstrained null components from multi-scale observables.
Risk-controlled post-processing yields a threshold-structured policy that follows the baseline except where an oracle fallback sharply reduces conditional violation risk, achieving O(log n/n) expected excess risk in i.i.d. settings and exact risk control under exchangeability.
NF-NPCDR enhances neural processes with normalizing flows to model personalized multi-interest preferences and uses a preference pool plus adaptive decoder to improve cross-domain recommendations for cold-start users.
MOFAT applied to SN2024ggi shows CO triggering inner SiO formation with a receding edge, order-of-magnitude mass drop, clumping signatures, and no dust formation.
MorphoFlow learns compact probabilistic 3D shape representations from sparse annotations using neural implicits, autodecoders, autoregressive flows, and adaptive sparsity priors on latent dimensions.
VaFES constructs a latent space from reversible collective variables and variationally optimizes a tractable-density generative model to produce a continuous free energy surface from which rare events are directly sampled.
Differentiable nonconformity scores induce flows that sample conformal prediction set boundaries, and mixing flows across levels produces conformal predictive distributions whose quantiles match the sets.
ARDIS enables arbitrary-resolution deep image steganography via frequency decoupling in hiding and latent-guided implicit reconstruction for blind recovery.
RGFlow uses flow-based neural networks to learn bijective real-space RG transformations for the 2D phi^4 theory, identifying a Wilson-Fisher-like critical point and estimating the correlation length exponent.
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
vsOED uses a variational one-point reward and RL policy optimization to provide a lower bound on expected information gain for sequential experimental design, supporting nuisance parameters, implicit likelihoods, and multiple design goals.
citing papers explorer
-
Application of deep neural networks for computing the renormalization group flow of the two-dimensional phi^4 field theory
RGFlow uses flow-based neural networks to learn bijective real-space RG transformations for the 2D phi^4 theory, identifying a Wilson-Fisher-like critical point and estimating the correlation length exponent.
-
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
-
The Ensemble Schrödinger Bridge filter for Nonlinear Data Assimilation
The Ensemble Schrödinger Bridge filter adds a diffusion-based analysis step to ensemble prediction, enabling effective nonlinear data assimilation without structural model error or training.
-
RefTon: Reference person shot assist virtual Try-on
RefTon is a flux-based virtual try-on method that uses unpaired reference images of the target garment on different people to guide texture and detail preservation in a streamlined person-to-person pipeline without body parsing or masks.
-
Marginal Girsanov Reweighting: Stable Variance Reduction for Long-Timescale Dynamics from Biased Simulation
Marginal Girsanov Reweighting stabilizes variance by marginalizing over intermediate paths to enable reliable reweighting of long-timescale dynamics from biased molecular simulations.
-
Energy-Weighted Flow Matching: Unlocking Continuous Normalizing Flows for Efficient and Scalable Boltzmann Sampling
Energy-Weighted Flow Matching reformulates conditional flow matching with importance sampling to enable continuous normalizing flows to model Boltzmann distributions from energy evaluations alone, with iterative and annealed variants showing competitive performance on benchmarks.
-
Improving the Accuracy of Amortized Model Comparison with Self-Consistency
Self-consistency training on real data improves amortized Bayesian model comparison accuracy under distribution shifts, especially in open-world misspecification when analytic or locally accurate surrogate likelihoods are available.
-
TabICL: A Tabular Foundation Model for In-Context Learning on Large Data
TabICL scales in-context learning to large tabular data via column-then-row attention for row embeddings followed by a transformer, matching TabPFNv2 speed and performance while outperforming it and CatBoost on datasets over 10K samples.
-
Benchmarking Vision Foundation Models for Input Monitoring in Autonomous Driving
Vision foundation model embeddings with density modeling outperform state-of-the-art methods for unsupervised semantic and covariate shift detection in autonomous driving inputs.
-
PrefPaint: Enhancing Medical Image Inpainting through Expert Human Feedback
PrefPaint uses D3PO and a Model Tree web interface to incorporate gastroenterologist feedback into Stable Diffusion inpainting, producing anatomically accurate polyp images that outperform prior methods in user studies.
-
Towards a Universal Foundation Model for Protein Dynamics: A Multi-Chain Tree-Structured Framework with Transformer Propagators
Proposes the TSCG hierarchical representation and a Transformer propagator for universal coarse-grained protein MD, with a claimed 10k-20k times acceleration over all-atom MD while preserving statistical properties.