arxiv: 2511.16520 · v3 · submitted 2025-11-20 · 💻 cs.LG · cs.CV· eess.IV· eess.SP

Saving Foundation Flow-Matching Priors for Inverse Problems

Yuxiang Wan , Ryan Devera , Wenjie Zhang , Ju Sun This is my paper

Pith reviewed 2026-05-17 20:32 UTC · model grok-4.3

classification 💻 cs.LG cs.CVeess.IVeess.SP

keywords flow-matchinginverse problemsfoundation modelsgenerative priorsplug-in frameworkimage restorationscientific imagingwarm-start regularization

0 comments

The pith

A plug-in framework turns foundation flow-matching models into effective priors for inverse problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FMPlug to make foundation flow-matching models practical as universal priors for solving inverse problems. These models currently underperform compared to domain-specific or untrained priors because they lack tailored guidance for each new task. FMPlug adds an instance-guided time-dependent warm-start along with sharp Gaussianity regularization to provide problem-specific direction while keeping the original Gaussian structures intact. The approach is evaluated on standard image restoration and on scientific inverse problems where only a few similar samples are available. If the method works as claimed, it allows reuse of large pretrained models across many tasks without collecting new training data or training from scratch for each domain.

Core claim

FMPlug is a plug-in framework that redefines how foundation flow-matching models are applied to inverse problems by combining an instance-guided, time-dependent warm-start strategy with sharp Gaussianity regularization. This combination adds problem-specific guidance while preserving the Gaussian structures of the foundation model. Experiments on both simple image restoration tasks and scientific inverse problems that have only a few similar samples demonstrate superior results over domain-specific and untrained priors.

What carries the argument

FMPlug, a plug-in framework that integrates instance-guided time-dependent warm-start with sharp Gaussianity regularization to adapt foundation flow-matching models for specific inverse problems while retaining their Gaussian properties.

If this is right

Foundation flow-matching models can be reused as priors across different inverse problems without retraining for each one.
Scientific inverse problems become solvable even when only a small number of similar samples exist for training a specialized model.
Performance on image restoration improves beyond what either fully trained domain models or completely untrained priors currently achieve.
The cost of data collection and model training for new scientific domains drops because the same foundation model serves multiple tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same plug-in pattern of warm-start plus structure-preserving regularization could be tested on other families of generative foundation models for inverse problems.
Applying FMPlug to inverse problems in physics or biology that involve very different data modalities might expose limits in how far the Gaussian preservation holds.
Combining FMPlug with iterative refinement loops could further improve sample efficiency on problems where only one or two measurements are available.

Load-bearing premise

The instance-guided warm-start and Gaussianity regularization can be added without disrupting the useful Gaussian structures of the foundation flow-matching model or introducing biases that hurt performance on new inverse problems.

What would settle it

If applying FMPlug to a new set of inverse problems produces worse reconstructions or visible instabilities compared to the original foundation model or to untrained priors, the claim that the added components preserve performance would be falsified.

Figures

Figures reproduced from arXiv: 2511.16520 by Ju Sun, Ryan Devera, Wenjie Zhang, Yuxiang Wan.

**Figure 2.** Figure 2: Comparison between foundation FM, domain-specific FM, and untrained priors for Gaussian deblurring with varying kernel size (Gaussian sigma) and hence varying difficulty level. Notations are the same as in [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Plot of the function h(z0) (after a change of variable u = ∥z0∥ 2 2 ). An ideal regularization function should blow up sharply away from the narrow concentration region in orange to promote Gaussianity effectively. Why is the Gaussian regularization in D-Flow problematic? If z0 ∼ N (0, I), ∥z0∥ 2 2 ∼ χ 2 (d) and the negative log-likelihood is h(z0) = −(d/2 − 1) log ∥z0∥ 2 2 + ∥z0∥ 2 2 /2 + C for some con… view at source ↗

**Figure 4.** Figure 4: Visual comparison of results in Gaussian deblurring. [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 5.** Figure 5: Qualitative comparison of results on knee MRI and LIS. GT: groundtruth [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 6.** Figure 6: Qualitative comparison in super resolution [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗

**Figure 7.** Figure 7: Qualitative comparison in Inpainting task. [PITH_FULL_IMAGE:figures/full_fig_p017_7.png] view at source ↗

**Figure 8.** Figure 8: Qualitative comparison in motion deblur task. [PITH_FULL_IMAGE:figures/full_fig_p017_8.png] view at source ↗

read the original abstract

Foundation flow-matching (FM) models promise universal priors for solving inverse problems (IPs); yet today, they trail behind domain-specific and even untrained priors. \emph{How can we unlock their potential?} We introduce FMPlug, a plug-in framework that redefines how foundation FMs are used in IPs. FMPlug combines an instance-guided, time-dependent warm-start strategy with sharp Gaussianity regularization, adding problem-specific guidance while preserving the Gaussian structures. For evaluation, we consider both simple image restoration tasks and scientific IPs with a few similar samples -- where the prohibitive cost of data collection and model training hinders the development of domain-specific generative models. Our superior experimental results confirm the effectiveness of FMPlug. Overall, FMPlug paves the way for making foundation FM models practical, reusable priors for IPs, especially scientific ones with few similar samples. More details are available at https://sun-umn.github.io/xm-plug/ .

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

FMPlug adds a warm-start plus Gaussian regularization to foundation flow models for inverse problems, but the key preservation claim lacks quantitative backing on OOD scientific data.

read the letter

The core idea is a plug-in recipe called FMPlug that layers an instance-guided, time-dependent warm-start and sharp Gaussianity regularization onto pre-trained flow-matching models so they can serve as priors for inverse problems. The authors target exactly the setting where domain-specific training is too expensive: scientific IPs with only a handful of similar samples. They report better results than baselines on both standard restoration tasks and those scientific cases, which is the practical payoff they emphasize. That combination of components does not appear to be a direct restatement of earlier flow-matching or plug-and-play prior work, so the concrete recipe is the new piece. The experiments are presented as evidence that the additions improve conditioning without destroying the useful structure of the foundation model. The soft spot is the missing verification that the velocity field and marginals stay intact after the modifications, especially on out-of-distribution scientific data. The abstract asserts preservation, yet there are no reported checks such as Wasserstein distances between pre- and post-regularization marginals, changes in Lipschitz constants of the velocity, or ablations on held-out domains. Without those numbers it is difficult to know whether the warm-start or the regularization quietly shifts the prior in ways that only appear under low-data scientific conditions. The math itself builds on standard flow-matching, so that part looks solid. This is useful reading for people who already work with generative priors for imaging or remote-sensing inverse problems and want a lightweight way to reuse large models. A reader who needs a ready-to-try adaptation strategy will get value from the description and the reported gains. The work is coherent enough on its own terms to deserve a serious referee, though the authors will probably be asked to add the preservation diagnostics before acceptance.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces FMPlug, a plug-in framework for adapting foundation flow-matching (FM) models as priors for inverse problems (IPs). It combines an instance-guided, time-dependent warm-start strategy with sharp Gaussianity regularization to inject problem-specific guidance while claiming to preserve the Gaussian structures of the pre-trained FM model. The approach is evaluated on standard image restoration tasks as well as scientific IPs that involve only a few similar samples, a regime where domain-specific generative models are impractical due to data-collection costs. The authors report superior experimental results and conclude that FMPlug enables practical, reusable use of foundation FM priors, especially for scientific applications with limited data.

Significance. If the preservation of the foundation model's velocity field and marginal properties is rigorously verified and the experimental gains hold under proper controls, the work could meaningfully advance the deployment of large-scale generative priors in data-scarce scientific inverse problems. The plug-in design avoids expensive retraining and directly targets the low-sample regime highlighted in the abstract, which is a practically important setting. The project page referenced in the abstract indicates an effort toward reproducibility that would strengthen the contribution if code and checkpoints are released.

major comments (2)

[§3 (Method) and abstract] §3 (Method) and abstract: the claim that the instance-guided warm-start plus sharp Gaussianity regularization 'preserve the Gaussian structures' and transport properties of the foundation FM model lacks any quantitative verification. No measurements are reported for changes in velocity-field Lipschitz constant, Wasserstein distance between pre- and post-regularization marginals, or ablation performance on held-out scientific domains. Because the warm-start is explicitly instance- and time-dependent, this omission leaves open the possibility that the added components perturb the learned probability path precisely on the out-of-distribution scientific IPs that constitute the paper's motivating use case.
[§5 (Experiments)] §5 (Experiments): the assertion of 'superior experimental results' on scientific IPs is presented without sufficient detail on baselines, full quantitative tables, or ablation studies isolating the contribution of each plug-in component. The central claim that FMPlug makes foundation models practical for few-sample scientific IPs cannot be evaluated without these controls; the current evidence is therefore insufficient to support the headline conclusion.

minor comments (2)

[Abstract] The abstract would be strengthened by including one or two concrete performance deltas (e.g., PSNR or reconstruction error improvements) rather than the generic statement 'superior experimental results'.
[§3] Notation for the time-dependent warm-start and the 'sharp' regularization strength should be defined explicitly with symbols in the method section to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important areas for strengthening the rigor of our claims regarding preservation of the foundation model properties and the experimental validation. We address each major comment below and have revised the manuscript accordingly to incorporate additional analysis and details.

read point-by-point responses

Referee: [§3 (Method) and abstract] the claim that the instance-guided warm-start plus sharp Gaussianity regularization 'preserve the Gaussian structures' and transport properties of the foundation FM model lacks any quantitative verification. No measurements are reported for changes in velocity-field Lipschitz constant, Wasserstein distance between pre- and post-regularization marginals, or ablation performance on held-out scientific domains. Because the warm-start is explicitly instance- and time-dependent, this omission leaves open the possibility that the added components perturb the learned probability path precisely on the out-of-distribution scientific IPs that constitute the paper's motivating use case.

Authors: We agree that direct quantitative verification of preservation would strengthen the presentation. The sharp Gaussianity regularization is formulated to penalize deviations from the Gaussian marginals while the instance-guided warm-start is constructed as a minimal, time-dependent adjustment to the pre-trained velocity field. Nevertheless, the original submission did not report explicit metrics such as velocity-field Lipschitz constants or Wasserstein distances between marginals. In the revised manuscript we have added a dedicated analysis subsection to §3 that includes these measurements on both standard image domains and held-out scientific tasks. The new results show that the perturbations remain small and that performance on out-of-distribution scientific IPs is not degraded relative to the unmodified foundation model, thereby supporting the preservation claim. revision: yes
Referee: [§5 (Experiments)] the assertion of 'superior experimental results' on scientific IPs is presented without sufficient detail on baselines, full quantitative tables, or ablation studies isolating the contribution of each plug-in component. The central claim that FMPlug makes foundation models practical for few-sample scientific IPs cannot be evaluated without these controls; the current evidence is therefore insufficient to support the headline conclusion.

Authors: We acknowledge that the experimental section would benefit from greater detail to allow readers to fully evaluate the contribution. The original manuscript reported results on image restoration and a small number of scientific tasks but did not include exhaustive baseline comparisons or component-wise ablations. In the revised version we have expanded §5 with complete quantitative tables that include additional baselines (both domain-specific generative models where data permits and other plug-in priors), full numerical results with standard deviations across multiple runs, and dedicated ablation studies that isolate the instance-guided warm-start and the sharp Gaussianity regularization. These additions provide clearer evidence for the practical utility of FMPlug in the few-sample regime. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical plug-in method with external experimental validation

full rationale

The paper presents FMPlug as an empirical plug-in framework that adds an instance-guided time-dependent warm-start and sharp Gaussianity regularization to foundation flow-matching models for inverse problems. Claims rest on superior experimental results for image restoration and few-sample scientific IPs rather than any derivation chain. No equations, self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or description. The method is externally falsifiable via held-out experiments and does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the unstated premise that foundation flow-matching models already encode useful priors that can be steered with minimal modification while retaining their generative properties.

axioms (1)

domain assumption Foundation flow-matching models possess Gaussian structures that remain useful after problem-specific guidance is injected.
Invoked when the method adds regularization to preserve those structures.

pith-pipeline@v0.9.0 · 5468 in / 1238 out tokens · 51693 ms · 2026-05-17T20:32:26.859428+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

sharp Gaussianity regularization via an explicit constraint ... z ∈ S^{d-1}_ε(0, √d)
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_strictMono_of_one_lt unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

time-dependent warm-start strategy ... min_{z,t} ℓ(y, A ∘ G_θ(α_t y + β_t z, t))

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 7 internal anchors

[1]

Cosmos World Foundation Model Platform for Physical AI

Agarwal, Niket et al. Cosmos world foundation model platform for physical ai.arXiv:2501.03575,

work page internal anchor Pith review Pith/arXiv arXiv
[2]

Ntire 2017 challenge on single image super-resolution: Dataset and study

Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. InCVPR Workshops, July

work page 2017
[3]

Understanding untrained deep models for inverse problems: Algorithms and theory

Ismail Alkhouri, Evan Bell, Avrajit Ghosh, Shijun Liang, Rongrong Wang, and Saiprasad Rav- ishankar. Understanding untrained deep models for inverse problems: Algorithms and theory. arXiv:2502.18612,

work page arXiv
[4]

D-flow: Differentiating through flows for controlled generation.arXiv:2402.14017,

Heli Ben-Hamu, Omri Puny, Itai Gat, Brian Karrer, Uriel Singer, and Yaron Lipman. D-flow: Differentiating through flows for controlled generation.arXiv:2402.14017,

work page arXiv
[5]

FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space

Black Forest Labs et al. Flux.1 kontext: Flow matching for in-context image generation and editing in latent space.arXiv:2506.15742,

work page internal anchor Pith review Pith/arXiv arXiv
[6]

Neural Ordinary Differential Equations

URL https://arxiv.org/abs/1806.07366. Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. Stargan v2: Diverse image synthesis for multiple domains. InCVPR,

work page internal anchor Pith review Pith/arXiv arXiv
[7]

A Survey on Diffusion Models for Inverse Problems

Giannis Daras, Hyungjin Chung, Chieh-Hsin Lai, Yuki Mitsufuji, Jong Chul Ye, Peyman Milan- far, Alexandros G. Dimakis, and Mauricio Delbracio. A survey on diffusion models for inverse problems.arXiv:2410.00083,

work page internal anchor Pith review Pith/arXiv arXiv
[8]

Will Grathwohl, Ricky T

URLhttps://arxiv.org/abs/ 2506.02680. Will Grathwohl, Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and David Duvenaud. FFJORD: Free-form continuous dynamics for scalable reversible generative models. InInterna- tional Conference on Learning Representations (ICLR),

work page arXiv
[9]

FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models

URLhttps://arxiv.org/ abs/1810.01367. Luzhe Huang, Xilin Yang, Tairan Liu, and Aydogan Ozcan. Few-shot transfer learning for holo- graphic image reconstruction using a recurrent neural network.APL Photonics, 7(7), July

work page internal anchor Pith review Pith/arXiv arXiv
[10]

doi: 10.1063/5.0090582

ISSN 2378-0967. doi: 10.1063/5.0090582. URLhttp://dx.doi.org/10.1063/5. 0090582. Jeongsol Kim, Bryan Sangwoo Kim, and Jong Chul Ye. Flowdps: Flow-driven posterior sampling for inverse problems.arXiv:2503.08136,

work page doi:10.1063/5.0090582
[11]

Self-validation: Early stopping for single-instance deep generative priors.arXiv:2110.12271,

11 Taihui Li, Zhong Zhuang, Hengyue Liang, Le Peng, Hengkang Wang, and Ju Sun. Self-validation: Early stopping for single-instance deep generative priors.arXiv:2110.12271,

work page arXiv
[12]

org/abs/2505.11720

URLhttps://arxiv. org/abs/2505.11720. Yaron Lipman, Marton Havasi, Peter Holderrieth, Neta Shaul, Matt Le, Brian Karrer, Ricky TQ Chen, David Lopez-Paz, Heli Ben-Hamu, and Itai Gat. Flow matching guide and code. arXiv:2412.06264,

work page arXiv
[13]

Martin, A

S´egol`ene Martin, Anne Gagneux, Paul Hagemann, and Gabriele Steidl. Pnp-flow: Plug-and-play image restoration with flow matching.arXiv:2410.02423,

work page arXiv
[14]

Ali Mohamad-Djafari.Inverse problems in vision and 3D tomography

URLhttps://arxiv.org/ abs/2507.06644. Ali Mohamad-Djafari.Inverse problems in vision and 3D tomography. John Wiley & Sons,

work page arXiv
[15]

doi: 10.1109/JSAIT.2020.2991563. OpenAI. Video generation models as world simulators

work page doi:10.1109/jsait.2020.2991563 2020
[16]

Maitreya Patel, Song Wen, Dimitris N Metaxas, and Yezhou Yang

Accessed: [Current Date, e.g., July 28, 2025]. Maitreya Patel, Song Wen, Dimitris N Metaxas, and Yezhou Yang. Steering rectified flow models in the vector field for controlled image generation.arXiv:2412.00100,

work page arXiv 2025
[17]

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Patrick Esser et al. Scaling rectified flow transformers for high-resolution image synthesis. arXiv:2403.03206,

work page internal anchor Pith review Pith/arXiv arXiv
[18]

Training-free linear image inverses via flows.arXiv:2310.04432,

Ashwini Pokle, Matthew J Muckley, Ricky TQ Chen, and Brian Karrer. Training-free linear image inverses via flows.arXiv:2310.04432,

work page arXiv
[19]

Vincent Sitzmann, Julien N.P

doi: 10.1038/s41551-019-0466-4. Vincent Sitzmann, Julien N.P. Martel, Alexander W. Bergman, David B. Lindell, and Gordon Wet- zstein. Implicit neural representations with periodic activation functions. InAdvances in Neural Information Processing Systems (NeurIPS), volume 33, pp. 7462–7473,

work page doi:10.1038/s41551-019-0466-4
[20]

Score-Based Generative Modeling through Stochastic Differential Equations

URLhttps://arxiv.org/ abs/2011.13456. 12 Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Deep image prior.IJCV, 128(7): 1867–1888, March

work page internal anchor Pith review Pith/arXiv arXiv 2011
[21]

doi: 10.1007/s11263-020-01303-4

ISSN 1573-1405. doi: 10.1007/s11263-020-01303-4. Roman Vershynin.High-dimensional probability: An introduction with applications in data science, volume

work page doi:10.1007/s11263-020-01303-4
[22]

Baraniuk, Ashok Veer- araghavan, and Guha Balakrishnan

Kushal Vyas, Ahmed Imtiaz Humayun, Aniket Dashpute, Richard G. Baraniuk, Ashok Veer- araghavan, and Guha Balakrishnan. Learning transferable features for implicit neural repre- sentations.ArXiv, abs/2409.09566,

work page arXiv
[23]

Yuxiang Wan, Ryan Devera, Wenjie Zhang, and Ju Sun

URLhttps://api.semanticscholar.org/ CorpusID:272689239. Yuxiang Wan, Ryan Devera, Wenjie Zhang, and Ju Sun. Fmplug: Plug-in foundation flow-matching priors for inverse problems.arXiv preprint arXiv:2508.00721,

work page arXiv
[24]

Hengkang Wang, Xu Zhang, Taihui Li, Yuxiang Wan, Tiancong Chen, and Ju Sun

ISSN 2835-8856. Hengkang Wang, Xu Zhang, Taihui Li, Yuxiang Wan, Tiancong Chen, and Ju Sun. Dmplug: A plug-in method for solving inverse problems with diffusion models.ArXiv:2405.16749,

work page arXiv
[25]

Temporal-consistent video restoration with pre-trained diffusion models

Hengkang Wang, Yang Liu, Huidong Liu, Chien-Chih Wang, Yanhui Guo, Hongdong Li, Bryan Wang, and Ju Sun. Temporal-consistent video restoration with pre-trained diffusion models. arXiv:2503.14863,

work page arXiv
[26]

Guidance with spherical gaussian constraint for conditional diffusion.arXiv:2402.03201,

Lingxiao Yang, Shutong Ding, Yifan Cai, Jingyi Yu, Jingya Wang, and Ye Shi. Guidance with spherical gaussian constraint for conditional diffusion.arXiv:2402.03201,

work page arXiv
[27]

What is wrong with end-to-end learning for phase retrieval?arXiv:2403.15448,

Wenjie Zhang, Yuxiang Wan, Zhong Zhuang, and Ju Sun. What is wrong with end-to-end learning for phase retrieval?arXiv:2403.15448,

work page arXiv
[28]

Zhong Zhuang, Taihui Li, Hengkang Wang, and Ju Sun

URLhttps://arxiv.org/abs/2503.11043. Zhong Zhuang, Taihui Li, Hengkang Wang, and Ju Sun. Blind Image Deblurring with Unknown Kernel Size and Substantial Noise.IJCV, September 2023a. ISSN 1573-1405. doi: 10.1007/ s11263-023-01883-x. Zhong Zhuang, David Yang, Felix Hofmann, David Barmherzig, and Ju Sun. Practical phase re- trieval using double deep image pr...

work page doi:10.2352/ei.2023.35.14.coimg-153 2023
[29]

•FMPlugWe useAdamWas our default optimizer

as the backbone model whenever foundation FM models are needed. •FMPlugWe useAdamWas our default optimizer. The number of function evaluations (NFE) is3and we use theHeun2ODE solver to balance efficiency and accuracy. The learning rate forzis0.5, and fortis0.005. •D-FlowWe use their default optimizer:LBFGSalgorithm with line search. The NFE= 6 with theHeu...

work page 2025