Guided Flows for Generative Modeling and Decision Making
Classifier-free guidance is a key component for enhancing the performance of conditional generative models across diverse tasks. While it has previously demonstrated remarkable improvements in sample quality, it has been employed exclusively for diffusion models. In this paper, we integrate classifier-free guidance into Flow Matching (FM) models, an alternative simulation-free approach that trains Continuous Normalizing Flows (CNFs) by regressing vector fields. We explore the use of Guided Flows for a variety of downstream applications. We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis, achieving state-of-the-art performance. Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, showcasing a 10x speedup in computation compared to diffusion models while maintaining comparable performance.
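To make the idea in the abstract concrete, here is a minimal sketch of how classifier-free guidance is commonly applied to a flow model: the conditional and unconditional vector fields are blended with a guidance weight, and the resulting ODE is integrated from noise to a sample. The abstract does not give the formula, so the blending rule below is the usual classifier-free guidance form, assumed rather than quoted. The names guided_velocity, euler_sample, toy_v_cond, and toy_v_uncond are hypothetical, and the toy fields stand in for trained networks.

```python
import numpy as np

def guided_velocity(v_cond, v_uncond, x, t, y, w):
    """Blend unconditional and conditional velocities with guidance weight w.

    Assumed form: (1 - w) * v(x, t) + w * v(x, t | y).
    w = 1 gives the purely conditional field; w > 1 extrapolates past it.
    """
    return (1.0 - w) * v_uncond(x, t) + w * v_cond(x, t, y)

def euler_sample(v_cond, v_uncond, x0, y, w, n_steps=100):
    """Integrate dx/dt = guided velocity from t = 0 to t = 1 with Euler steps."""
    x = np.asarray(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        x = x + dt * guided_velocity(v_cond, v_uncond, x, t, y, w)
    return x

# Hypothetical toy fields, NOT trained models: the conditional field pulls
# toward the condition y, the unconditional field pulls toward the origin.
def toy_v_cond(x, t, y):
    return y - x

def toy_v_uncond(x, t):
    return -x

x0 = np.zeros(2)
y = np.array([1.0, -2.0])
x_plain = euler_sample(toy_v_cond, toy_v_uncond, x0, y, w=1.0)
x_guided = euler_sample(toy_v_cond, toy_v_uncond, x0, y, w=2.0)
```

In this toy setting the guided field simplifies to -x + w * y, so raising w pushes the sample further in the direction of the condition. With a real model, both fields would typically come from one network trained with the condition randomly dropped, which is again an assumption about the standard recipe rather than a detail stated in the abstract.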
Forward citations
Cited by 29 Pith papers
-
Compositional Generative Modeling from Decentralized Data
DCFM is a new decentralized framework that enforces structural constraints on generative factors across siloed data sources to produce novel compositions via peer interactions.
-
Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior
DivIn samples initial noise from a guidance potential posterior via Langevin dynamics to improve diversity in class-to-image and text-to-image generation.
-
StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow
StAD distills divergence of PF-ODEs via the Langevin-Stein operator for faster, lower-variance likelihood estimation in generative models without Jacobian costs.
-
FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution
FluxFlow is a conservative pixel-space flow-matching framework for astronomical super-resolution that incorporates real atmospheric uncertainty and a training-free Wiener correction, outperforming baselines on a new 1...
-
Reflective Flow Sampling Enhancement
RF-Sampling enhances flow matching models by implicitly performing gradient ascent on text-image alignment scores via linear textual combinations and flow inversion.
-
TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance
TouchGuide improves contact-rich robot manipulation by steering diffusion or flow-matching visuomotor policies with tactile feasibility scores from a contrastively trained Contact Physical Model.
-
Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching
Derives exact guidance transition rates for discrete flow matching models that require only one model evaluation per sampling step and unify prior approximation-based methods.
-
Delta Rectified Flow Sampling for Text-to-Image Editing
DRFS is a new inversion-free editing technique for rectified flow models that models source-target velocity discrepancies and applies a time-dependent shift to improve fidelity and unify prior methods like DDS and FlowEdit.
-
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
-
Probabilistic Inversion with Flow Matching
Adapts Flow Matching from generative AI to probabilistic inversion, evaluated on a simple 2D velocity model and the OpenFWI seismic dataset.
-
Editing Everything Everywhere All at Once
MICE modifies joint attention biases in Multimodal Diffusion Transformers to enable concurrent multi-instance edits while reducing semantic interference via user masks.
-
Reversal Q-Learning
Reversal Q-Learning (RQL) proposes reversing flows for virtual trajectories and bias-variance reduction in an expanded MDP to train flow policies, reporting best average performance on 50 simulated robotic tasks versu...
-
Moment Matching Q-Learning
MoMa QL uses MMD moment matching to enforce distribution-level convergence of conditional score functions in flow-based RL policies for improved sampling efficiency.
-
Adversarial Dual On-Policy Distillation from Expressive Teacher
FA-OPD co-trains a flow-matching teacher and MLP student via adversarial dual on-policy distillation, improving robustness over baselines on six robot benchmarks with noisy or limited demonstrations.
-
Discrete Flow Matching for Offline-to-Online Reinforcement Learning
DRIFT enables stable offline-to-online fine-tuning of CTMC policies in discrete RL via advantage-weighted discrete flow matching, path-space regularization, and candidate-set approximation.
-
dFlowGRPO: Rate-Aware Policy Optimization for Discrete Flow Models
dFlowGRPO is a new rate-aware RL method for discrete flow models that outperforms prior GRPO approaches on image generation and matches continuous flow models while supporting broad probability paths.
-
Training-Free Image Editing with Visual Context Integration and Concept Alignment
VicoEdit performs training-free image editing by transforming source images directly with visual context and concept-alignment-guided posterior sampling, outperforming training-based methods.
-
Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging
Bayesian Stochastic Flow Matching augments flow models with stochastic diffusion for better generalization and uses Monte Carlo Dropout with antithetic sampling to disentangle uncertainties and detect out-of-distribut...
-
Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery
EnFlow integrates flow-based conformer generation with energy landscape modeling to enable joint ensemble generation and ground-state identification using only 1-2 ODE steps.
-
Latent Stochastic Interpolants
Latent Stochastic Interpolants jointly optimize encoder-decoder and a latent-space stochastic interpolant using a continuous-time ELBO to transform arbitrary priors into aggregated posteriors.
-
Improving Video Generation with Human Feedback
A human preference dataset and VideoReward model enable Flow-DPO and Flow-NRG to produce smoother, better-aligned videos from text prompts in flow-based generators.
-
VQActFlow: Vector-Quantized Action Mode Steering for Multi-Task Robot Manipulation
VQActFlow discretizes action chunks via vector quantization, generates code sequences with variational flow matching, and applies inference-time guidance to steer multi-task robot policies toward instructed and feasib...
-
FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution
FluxFlow uses conservative pixel-space flow-matching with uncertainty weights and Wiener test-time correction to outperform baselines on photometric and scientific accuracy for ground-to-space super-resolution, valida...
-
Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging
SFM improves generalization under distribution shift for scientific imaging tasks while AVUQ supplies sample-efficient epistemic and aleatoric uncertainty estimates plus anomaly scores.
-
Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models
An adaptive CFG method that tunes guidance based on LLM-detected mismatch between emotion prompts and text semantics improves emotional expressiveness in AR TTS while preserving audio quality and intelligibility.
-
Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models
Introduces CCG-CFG with inconsistency-based dynamic scales and hard-sample mining distillation to boost emotional alignment in auto-regressive TTS, reporting up to 12% absolute gains in emotion recognition accuracy.
-
PolyFlow: Safe and Efficient Polytope-Constrained Flow Matching with Constraint Embedding and Projection-free Update
PolyFlow is a constrained flow matching framework that embeds polytope constraints into the model dynamics for zero-violation generation with reduced inference latency in planning and control tasks.
-
General Covariant Action Modeling: Constructing Generalized Manifolds via Spatio-Temporal Decoupling
GAM framework uses arc-length parameterization for temporal invariance and schema-affine factorization for geometric invariance to build a covariant action manifold integrated into VLA models for improved generalizati...
-
Flow Matching Guide and Code
Flow Matching is a generative modeling framework with mathematical foundations, design choices, extensions, and open-source PyTorch code for applications like image and text generation.