pith. machine review for the scientific record.

arxiv: 2601.22158 · v3 · submitted 2026-01-29 · 💻 cs.CV

Recognition: unknown

One-step Latent-free Image Generation with Pixel Mean Flows

classification 💻 cs.CV
keywords image, space, diffusion, generation, one-step, core, flow-based, further

Modern diffusion/flow-based models for image generation typically exhibit two core characteristics: (i) using multi-step sampling, and (ii) operating in a latent space. Recent advances have made encouraging progress on each aspect individually, paving the way toward one-step diffusion/flow without latents. In this work, we take a further step towards this goal and propose "pixel MeanFlow" (pMF). Our core guideline is to formulate the network output space and the loss space separately. The network target is designed to be on a presumed low-dimensional image manifold (i.e., x-prediction), while the loss is defined via MeanFlow in the velocity space. We introduce a simple transformation between the image manifold and the average velocity field. In experiments, pMF achieves strong results for one-step latent-free generation on ImageNet at 256x256 resolution (2.22 FID) and 512x512 resolution (2.48 FID), filling a key missing piece in this regime. We hope that our study will further advance the boundaries of diffusion/flow-based generative models.
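The abstract mentions a transformation between the image manifold and the average velocity field without stating it. The sketch below shows one plausible form, not the paper's confirmed formulation. It assumes the common linear interpolation path z_t = (1 - t) * x + t * eps, with clean image x at t = 0 and noise eps at t = 1, and takes the average velocity over the interval [0, t]. The function names `x_to_avg_velocity` and `avg_velocity_to_x` are hypothetical.

```python
import numpy as np

def x_to_avg_velocity(z_t, x_pred, t):
    # Assumed convention: the average velocity over [0, t] is the displacement
    # from the predicted image to z_t, divided by the elapsed time t (t > 0).
    return (z_t - x_pred) / t

def avg_velocity_to_x(z_t, u, t):
    # Inverse map: step from z_t back to time 0 along the average velocity.
    return z_t - t * u

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 4))   # stand-in for a batch of images
eps = rng.standard_normal(x.shape)      # Gaussian noise
t = 0.7
z_t = (1.0 - t) * x + t * eps           # assumed linear interpolation path

# On this conditional path the average velocity reduces to eps - x.
u = x_to_avg_velocity(z_t, x, t)
x_back = avg_velocity_to_x(z_t, u, t)

# One-step sampling under the same assumptions: start at pure noise (t = 1)
# and take a single step along the average velocity.
one_step = avg_velocity_to_x(eps, eps - x, 1.0)
```

Under these assumptions a network can output an image-space quantity while the loss is computed on the implied velocity, which matches the separation of output space and loss space described in the abstract. The actual pMF parameterization and training target may differ.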

This paper has not been read by Pith yet.


Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Representation Fréchet Loss for Visual Generation

    cs.CV 2026-04 unverdicted novelty 8.0

    Fréchet Distance optimized as FD-loss in representation space by decoupling population size from batch size improves generator quality, enables one-step generation from multi-step models, and motivates a multi-represe...

  2. FREPix: Frequency-Heterogeneous Flow Matching for Pixel-Space Image Generation

    cs.CV 2026-05 unverdicted novelty 6.0

    FREPix achieves competitive FID scores on ImageNet by decomposing image generation into separate low- and high-frequency paths within a flow matching framework.

  3. Point-MF: One-step Point Cloud Generation from a Single Image via Mean Flows

    cs.CV 2026-04 unverdicted novelty 6.0

    Point-MF performs one-step point cloud reconstruction from single images by learning a mean velocity field in point space with a tailored Diffusion Transformer and a new auxiliary loss.

  4. PixelFlowCast: Latent-Free Precipitation Nowcasting via Pixel Mean Flows

    cs.CV 2026-05 unverdicted novelty 5.0

    PixelFlowCast delivers high-fidelity precipitation nowcasts from radar sequences using a latent-free Pixel Mean Flows predictor guided by a deterministic coarse stage and KANCondNet features.

  5. SubFlow: Sub-mode Conditioned Flow Matching for Diverse One-Step Generation

    cs.LG 2026-04 unverdicted novelty 5.0

    SubFlow restores full mode coverage in one-step flow matching by conditioning on sub-modes from semantic clustering, yielding higher diversity on ImageNet-256 while preserving FID.