Guided Flows for Generative Modeling and Decision Making

Aditya Grover; Matt Le; Neta Shaul; Qinqing Zheng; Ricky T. Q. Chen; Yaron Lipman

arxiv: 2311.13443 · v2 · pith:EEYLBIWNnew · submitted 2023-11-22 · 💻 cs.LG · cs.AI · cs.CV · cs.RO · stat.ML

Qinqing Zheng, Matt Le, Neta Shaul, Yaron Lipman, Aditya Grover, Ricky T. Q. Chen

classification 💻 cs.LG cs.AI cs.CV cs.RO stat.ML

keywords models flows guided performance classifier-free conditional diffusion flow

Classifier-free guidance is a key component for enhancing the performance of conditional generative models across diverse tasks. While it has previously demonstrated remarkable improvements in sample quality, it has been employed exclusively for diffusion models. In this paper, we integrate classifier-free guidance into Flow Matching (FM) models, an alternative simulation-free approach that trains Continuous Normalizing Flows (CNFs) based on regressing vector fields. We explore the usage of Guided Flows for a variety of downstream applications. We show that Guided Flows significantly improve the sample quality in conditional image generation and zero-shot text-to-speech synthesis, boasting state-of-the-art performance. Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, showcasing a 10x speedup in computation compared to diffusion models while maintaining comparable performance.

Forward citations

Cited by 29 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Compositional Generative Modeling from Decentralized Data
cs.LG 2026-06 unverdicted novelty 7.0

DCFM is a new decentralized framework that enforces structural constraints on generative factors across siloed data sources to produce novel compositions via peer interactions.
Initialization is Half the Battle: Generating Diverse Images from a Guidance Potential Posterior
cs.CV 2026-06 unverdicted novelty 7.0

DivIn samples initial noise from a guidance potential posterior via Langevin dynamics to improve diversity in class-to-image and text-to-image generation.
StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow
stat.ML 2026-05 unverdicted novelty 7.0

StAD distills divergence of PF-ODEs via the Langevin-Stein operator for faster, lower-variance likelihood estimation in generative models without Jacobian costs.
FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution
cs.CV 2026-05 unverdicted novelty 7.0

FluxFlow is a conservative pixel-space flow-matching framework for astronomical super-resolution that incorporates real atmospheric uncertainty and a training-free Wiener correction, outperforming baselines on a new 1...
Reflective Flow Sampling Enhancement
cs.CV 2026-03 unverdicted novelty 7.0

RF-Sampling enhances flow matching models by implicitly performing gradient ascent on text-image alignment scores via linear textual combinations and flow inversion.
TouchGuide: Inference-Time Steering of Visuomotor Policies via Touch Guidance
cs.RO 2026-01 unverdicted novelty 7.0

TouchGuide improves contact-rich robot manipulation by steering diffusion or flow-matching visuomotor policies with tactile feasibility scores from a contrastively trained Contact Physical Model.
Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching
cs.LG 2025-09 conditional novelty 7.0

Derives exact guidance transition rates for discrete flow matching models that require only one model evaluation per sampling step and unify prior approximation-based methods.
Delta Rectified Flow Sampling for Text-to-Image Editing
cs.CV 2025-09 unverdicted novelty 7.0

DRFS is a new inversion-free editing technique for rectified flow models that models source-target velocity discrepancies and applies a time-dependent shift to improve fidelity and unify prior methods like DDS and FlowEdit.
Steering Your Diffusion Policy with Latent Space Reinforcement Learning
cs.RO 2025-06 unverdicted novelty 7.0

DSRL steers pretrained diffusion policies for robotics by applying RL to their latent noise inputs, achieving sample-efficient real-world adaptation with only black-box access.
Probabilistic Inversion with Flow Matching
cs.LG 2026-06 unverdicted novelty 6.0

Adapts Flow Matching from generative AI to probabilistic inversion, evaluated on a simple 2D velocity model and the OpenFWI seismic dataset.
Editing Everything Everywhere All at Once
cs.CV 2026-06 unverdicted novelty 6.0

MICE modifies joint attention biases in Multimodal Diffusion Transformers to enable concurrent multi-instance edits while reducing semantic interference via user masks.
Reversal Q-Learning
cs.LG 2026-06 unverdicted novelty 6.0

Reversal Q-Learning (RQL) proposes reversing flows for virtual trajectories and bias-variance reduction in an expanded MDP to train flow policies, reporting best average performance on 50 simulated robotic tasks versu...
Moment Matching Q-Learning
cs.LG 2026-05 unverdicted novelty 6.0

MoMa QL uses MMD moment matching to enforce distribution-level convergence of conditional score functions in flow-based RL policies for improved sampling efficiency.
Adversarial Dual On-Policy Distillation from Expressive Teacher
cs.LG 2026-05 unverdicted novelty 6.0

FA-OPD co-trains a flow-matching teacher and MLP student via adversarial dual on-policy distillation, improving robustness over baselines on six robot benchmarks with noisy or limited demonstrations.
Discrete Flow Matching for Offline-to-Online Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 6.0

DRIFT enables stable offline-to-online fine-tuning of CTMC policies in discrete RL via advantage-weighted discrete flow matching, path-space regularization, and candidate-set approximation.
dFlowGRPO: Rate-Aware Policy Optimization for Discrete Flow Models
cs.LG 2026-05 unverdicted novelty 6.0

dFlowGRPO is a new rate-aware RL method for discrete flow models that outperforms prior GRPO approaches on image generation and matches continuous flow models while supporting broad probability paths.
Training-Free Image Editing with Visual Context Integration and Concept Alignment
cs.CV 2026-04 unverdicted novelty 6.0

VicoEdit performs training-free image editing by transforming source images directly with visual context and concept-alignment-guided posterior sampling, outperforming training-based methods.
Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging
cs.LG 2026-03 unverdicted novelty 6.0

Bayesian Stochastic Flow Matching augments flow models with stochastic diffusion for better generalization and uses Monte Carlo Dropout with antithetic sampling to disentangle uncertainties and detect out-of-distribut...
Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery
cs.LG 2025-12 unverdicted novelty 6.0

EnFlow integrates flow-based conformer generation with energy landscape modeling to enable joint ensemble generation and ground-state identification using only 1-2 ODE steps.
Latent Stochastic Interpolants
cs.LG 2025-06 unverdicted novelty 6.0

Latent Stochastic Interpolants jointly optimize encoder-decoder and a latent-space stochastic interpolant using a continuous-time ELBO to transform arbitrary priors into aggregated posteriors.
Improving Video Generation with Human Feedback
cs.CV 2025-01 unverdicted novelty 6.0

A human preference dataset and VideoReward model enable Flow-DPO and Flow-NRG to produce smoother, better-aligned videos from text prompts in flow-based generators.
VQActFlow: Vector-Quantized Action Mode Steering for Multi-Task Robot Manipulation
cs.RO 2026-06 unverdicted novelty 5.0

VQActFlow discretizes action chunks via vector quantization, generates code sequences with variational flow matching, and applies inference-time guidance to steer multi-task robot policies toward instructed and feasib...
FluxFlow: Conservative Flow-Matching for Astronomical Image Super-Resolution
cs.CV 2026-05 unverdicted novelty 5.0

FluxFlow uses conservative pixel-space flow-matching with uncertainty weights and Wiener test-time correction to outperform baselines on photometric and scientific accuracy for ground-to-space super-resolution, valida...
Uncertainty-Aware Distribution-to-Distribution Flow Matching for Scientific Imaging
cs.LG 2026-03 unverdicted novelty 5.0

SFM improves generalization under distribution shift for scientific imaging tasks while AVUQ supplies sample-efficient epistemic and aleatoric uncertainty estimates plus anomaly scores.
Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models
cs.CL 2025-10 unverdicted novelty 5.0

An adaptive CFG method that tunes guidance based on LLM-detected mismatch between emotion prompts and text semantics improves emotional expressiveness in AR TTS while preserving audio quality and intelligibility.
Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models
cs.CL 2025-10 unverdicted novelty 5.0

Introduces CCG-CFG with inconsistency-based dynamic scales and hard-sample mining distillation to boost emotional alignment in auto-regressive TTS, reporting up to 12% absolute gains in emotion recognition accuracy.
PolyFlow: Safe and Efficient Polytope-Constrained Flow Matching with Constraint Embedding and Projection-free Update
cs.LG 2026-06 unverdicted novelty 4.0

PolyFlow is a constrained flow matching framework that embeds polytope constraints into the model dynamics for zero-violation generation with reduced inference latency in planning and control tasks.
General Covariant Action Modeling: Constructing Generalized Manifolds via Spatio-Temporal Decoupling
cs.CV 2026-05 unverdicted novelty 4.0

GAM framework uses arc-length parameterization for temporal invariance and schema-affine factorization for geometric invariance to build a covariant action manifold integrated into VLA models for improved generalizati...
Flow Matching Guide and Code
cs.LG 2024-12 unverdicted novelty 2.0

Flow Matching is a generative modeling framework with mathematical foundations, design choices, extensions, and open-source PyTorch code for applications like image and text generation.