HazeMatching: Dehazing Light Microscopy Images with Guided Conditional Flow Matching
Pith reviewed 2026-05-19 07:42 UTC · model grok-4.3
The pith
HazeMatching guides conditional flow matching with hazy observations to dehaze microscopy images while balancing fidelity and realism.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HazeMatching achieves a consistent balance between fidelity and realism on average across five datasets while producing well-calibrated predictions and without requiring an explicit degradation operator.
What carries the argument
The conditional velocity field in the flow matching process, steered directly by the hazy observation to generate the clean image.
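As a rough illustration of this mechanism, the training signal can be sketched as a regression of the predicted velocity onto the straight-line velocity between a noise sample and the clean image, with the hazy image supplied as an extra input. This is a minimal sketch under assumptions, not the authors' implementation: `cfm_loss`, `cfm_training_pair`, and `toy_velocity` are hypothetical names, images are flattened to Python lists, and the toy field stands in for a trained network.

```python
import random

def cfm_training_pair(x1, t, rng):
    """Build one flow-matching example for a clean target x1 at time t in [0, 1)."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]                 # Gaussian noise sample
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]   # point on the straight path
    target = [b - a for a, b in zip(x0, x1)]               # target velocity x1 - x0
    return xt, target

def cfm_loss(velocity_fn, x1, hazy, t, rng):
    """Mean squared error between predicted and target velocity."""
    xt, target = cfm_training_pair(x1, t, rng)
    pred = velocity_fn(xt, hazy, t)                        # the model sees x_t AND the hazy image
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)

def toy_velocity(xt, hazy, t):
    """Hypothetical stand-in for a network: assumes the hazy image is the clean one."""
    return [(h - x) / (1.0 - t) for x, h in zip(xt, hazy)]

rng = random.Random(0)
clean = [0.2, 0.8, 0.5]
print(cfm_loss(toy_velocity, clean, clean, 0.3, rng))            # ~0: conditioning is perfect
print(cfm_loss(toy_velocity, clean, [0.0, 0.0, 0.0], 0.3, rng))  # > 0: conditioning is uninformative
```

The toy field recovers the target velocity exactly only when the conditioning image equals the clean image, which is one way to see why the quality of the hazy observation matters for the learned field.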
If this is right
- Dehazing becomes practical on real widefield data because no explicit forward degradation model is needed.
- Outputs are well-calibrated, supporting downstream tasks that rely on uncertainty quantification.
- The same guided flow-matching structure can be reused for other image restoration problems that lack known degradation operators.
- Average performance across distortion and perceptual metrics improves over methods that optimize only one objective.
Where Pith is reading between the lines
- The approach might generalize to other modalities such as electron microscopy or medical imaging where similar out-of-focus effects occur.
- Calibration properties could be leveraged for ensemble methods or active learning in high-throughput screening pipelines.
- Varying the strength of the hazy observation guidance during sampling could produce controllable trade-offs for different scientific use cases.
Load-bearing premise
Guiding the conditional velocity field directly with the hazy observation is sufficient to produce both high-fidelity and perceptually realistic outputs without introducing bias or mode collapse.
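To make the premise concrete, sampling can be pictured as Euler integration of the velocity field from Gaussian noise, with the hazy image passed in at every step. The following is a minimal sketch under assumptions rather than the paper's sampler: `euler_sample`, `toy_velocity`, and the step count are hypothetical, and the toy field replaces a trained network.

```python
import random

def euler_sample(velocity_fn, hazy, n_steps=20, seed=0):
    """Integrate dx/dt = v(x, hazy, t) from t=0 to t=1 with fixed Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in hazy]        # start from Gaussian noise
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt                                 # t stays below 1, so 1 - t > 0
        v = velocity_fn(x, hazy, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def toy_velocity(x, hazy, t):
    """Hypothetical field that points from x straight at the conditioning image."""
    return [(h - xi) / (1.0 - t) for xi, h in zip(x, hazy)]

print(euler_sample(toy_velocity, [0.2, 0.8]))      # lands (numerically) on [0.2, 0.8]
```

With a trained network, different seeds would yield different plausible restorations; the toy field collapses all of them onto one point, which is the kind of behavior a diversity check is meant to detect.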
What would settle it
A new microscopy dataset with paired hazy and ground-truth clean images where HazeMatching shows either large drops in PSNR relative to fidelity-focused baselines or poor calibration scores on held-out uncertainty estimates.
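For reference, the fidelity side of such a test would typically rest on PSNR, which is a log-scaled inverse of the mean squared error. A minimal sketch, assuming images are flattened lists with a known intensity range (`data_range` is an assumption; microscopy papers often use range-invariant variants instead):

```python
import math

def mse(pred, target):
    """Mean squared error between two equally sized pixel lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite for a perfect match."""
    err = mse(pred, target)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(data_range ** 2 / err)

print(psnr([0.5, 0.5], [0.4, 0.6]))   # MSE is 0.01, so about 20 dB
```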
Original abstract
Fluorescence microscopy is a major driver of scientific progress in the life sciences. Although high-end confocal microscopes are capable of filtering out-of-focus light, cheaper and more accessible microscopy modalities, such as widefield microscopy, cannot, which consequently leads to hazy image data. Computational dehazing tries to combine the best of both worlds: cheap microscopy with crisp-looking images. The perception-distortion trade-off tells us that we can optimize either for data fidelity, e.g. low MSE or high PSNR, or for data realism, measured by perceptual metrics such as LPIPS or FID. Existing methods either prioritize fidelity at the expense of realism, or produce perceptually convincing results that lack quantitative accuracy. In this work, we propose HazeMatching, a novel iterative method for dehazing light microscopy images, which effectively balances these objectives. Our goal was to find a balanced trade-off between the fidelity of the dehazing results and the realism of individual predictions (samples). We achieve this by adapting the conditional flow matching framework, guiding the generative process with a hazy observation in the conditional velocity field. We evaluate HazeMatching on 5 datasets, covering both synthetic and real data, assessing both distortion and perceptual quality. Our method is compared against 12 baselines, achieving a consistent balance between fidelity and realism on average. Additionally, with calibration analysis, we show that HazeMatching produces well-calibrated predictions. Note that our method does not need an explicit degradation operator to exist, making it easily applicable to real microscopy data. All data used for training and evaluation and our code will be publicly available under a permissive license.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces HazeMatching, an adaptation of conditional flow matching for dehazing widefield fluorescence microscopy images. By directly guiding the conditional velocity field with the hazy observation, the method seeks to balance quantitative fidelity (PSNR/MSE) and perceptual realism (LPIPS/FID) without requiring an explicit degradation operator. It reports evaluation on five datasets (synthetic and real) against twelve baselines, claiming consistent average performance balance plus well-calibrated predictions, with code and data to be released publicly.
Significance. If the empirical claims hold under more rigorous validation, the work would provide a practical, degradation-model-free approach to improving image quality in accessible microscopy modalities, which could benefit life-sciences applications. The public release of code and data is a clear strength that supports reproducibility. The contribution is primarily empirical rather than theoretical, extending flow-matching techniques to a new domain with a focus on the perception-distortion trade-off.
major comments (3)
- [§4] §4 (Experiments), Table 2 and associated text: the claim of 'consistent balance between fidelity and realism on average' across five datasets rests on reported average rankings, yet no statistical significance tests (e.g., Wilcoxon signed-rank or paired t-tests with correction) or per-dataset standard deviations are provided; this weakens the robustness of the central empirical claim.
- [§3.2] §3.2 (Method, conditional velocity field): the guidance of v_t(x | hazy) is presented as sufficient to avoid both under-fitting the conditional mean and mode collapse, but the manuscript contains no ablations on guidance strength, no visualizations of the learned velocity field, and no diversity metrics (e.g., variance across multiple samples per input); these omissions directly affect the validity of the no-bias/no-collapse assumption.
- [§4.3] §4.3 (Real data evaluation): training relies on synthetic haze pairs even for real microscopy test images, yet no domain-gap analysis or sensitivity study is reported; an unmodeled shift could silently bias the conditional mean while still producing plausible perceptual scores, undermining applicability claims for real data.
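The statistical check requested in the first major comment could look roughly like the following: an exact two-sided Wilcoxon signed-rank test on paired scores, followed by Bonferroni correction over several baseline comparisons. This is a hedged, stdlib-only sketch; `signed_rank_pvalue` and `bonferroni` are hypothetical helper names, the enumeration is only practical for small sample sizes, and the numbers in the usage lines are made up for illustration.

```python
from itertools import product

def signed_rank_pvalue(a, b):
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]    # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                       # tied magnitudes share the average rank
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2.0                          # expected W+ under the null
    observed = abs(w_plus - centre)
    hits = sum(                                        # enumerate all 2^n sign patterns
        1 for signs in product((0, 1), repeat=n)
        if abs(sum(r for r, s in zip(ranks, signs) if s) - centre) >= observed - 1e-12
    )
    return hits / 2 ** n

def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Illustrative scores only: one value per dataset for two methods.
p = signed_rank_pvalue([10, 12, 15, 19, 24], [9, 10, 12, 15, 19])
print(p)                      # 0.0625
print(bonferroni([p, 0.5]))   # [0.125, 1.0]
```

Note that with only five paired values (one per dataset) the smallest two-sided p-value this test can return is 2/32 = 0.0625, so a meaningful test would likely need per-image rather than per-dataset pairs.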
minor comments (2)
- [Abstract / §1] The abstract and §1 mention 'well-calibrated predictions' but the calibration plot details (binning, expected calibration error formula) appear only in supplementary material; moving a concise description to the main text would improve clarity.
- [§3] Notation for the hazy observation (denoted y or I_hazy) is used inconsistently across equations in §3; a single consistent symbol and a short notation table would reduce reader effort.
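Since the calibration details are reported only in the supplementary material, the following is a generic sketch of one common regression-calibration check, not a reconstruction of the paper's procedure: pixels are binned by predicted standard deviation, and the root mean predicted variance in each bin is compared with the observed RMSE. The function names, the equal-count binning, and the gap summary are all assumptions.

```python
import math

def calibration_curve(pred_std, abs_err, n_bins=4):
    """Return (root mean variance, RMSE) per bin of pixels sorted by predicted std."""
    pairs = sorted(zip(pred_std, abs_err))             # order pixels by predicted uncertainty
    size = math.ceil(len(pairs) / n_bins)              # roughly equal-count bins
    curve = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        rmv = math.sqrt(sum(s * s for s, _ in chunk) / len(chunk))
        rmse = math.sqrt(sum(e * e for _, e in chunk) / len(chunk))
        curve.append((rmv, rmse))
    return curve

def calibration_gap(curve):
    """Mean absolute difference between predicted and observed error across bins."""
    return sum(abs(rmv - rmse) for rmv, rmse in curve) / len(curve)

well = calibration_curve([0.1, 0.1, 0.2, 0.2], [0.1, 0.1, 0.2, 0.2], n_bins=2)
over = calibration_curve([0.1, 0.1, 0.2, 0.2], [0.2, 0.2, 0.4, 0.4], n_bins=2)
print(calibration_gap(well))   # 0.0: predicted spread matches the error
print(calibration_gap(over))   # > 0: the model is overconfident
```

In a sampling-based method the predicted standard deviation would typically come from the spread of several generated samples per input.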
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below, indicating where revisions will be made to improve the robustness and clarity of our empirical claims and methodological justifications.
- Referee: §4 (Experiments), Table 2 and associated text: the claim of 'consistent balance between fidelity and realism on average' across five datasets rests on reported average rankings, yet no statistical significance tests (e.g., Wilcoxon signed-rank or paired t-tests with correction) or per-dataset standard deviations are provided; this weakens the robustness of the central empirical claim.
  Authors: We agree that statistical validation would strengthen the central claim. In the revised manuscript we will augment Table 2 with per-dataset standard deviations for all metrics and add paired statistical tests (Wilcoxon signed-rank with Bonferroni correction) comparing HazeMatching against the top baselines across the five datasets. These additions will be reported in §4.
  Revision: yes
- Referee: §3.2 (Method, conditional velocity field): the guidance of v_t(x | hazy) is presented as sufficient to avoid both under-fitting the conditional mean and mode collapse, but the manuscript contains no ablations on guidance strength, no visualizations of the learned velocity field, and no diversity metrics (e.g., variance across multiple samples per input); these omissions directly affect the validity of the no-bias/no-collapse assumption.
  Authors: We acknowledge the value of these supporting analyses. The revised version will include (i) an ablation study on guidance strength, (ii) qualitative visualizations of the learned conditional velocity fields at selected timesteps, and (iii) quantitative diversity metrics (sample variance and pairwise LPIPS across 10 generations per input) to substantiate the no-bias and no-collapse claims.
  Revision: yes
- Referee: §4.3 (Real data evaluation): training relies on synthetic haze pairs even for real microscopy test images, yet no domain-gap analysis or sensitivity study is reported; an unmodeled shift could silently bias the conditional mean while still producing plausible perceptual scores, undermining applicability claims for real data.
  Authors: This concern is valid. Because paired real hazy-clean microscopy data are unavailable, a full quantitative sensitivity study is not feasible with the current resources. We will therefore expand §4.3 with an explicit discussion of the synthetic-to-real domain gap, qualitative comparison of feature distributions, and a clearer statement of the resulting limitations on real-data applicability.
  Revision: partial
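The sample-variance diversity metric promised in the second response can be sketched as follows. This is a minimal illustration under assumptions: each generation is a flattened list of pixel values, the helper names are hypothetical, and population variance (dividing by the number of samples) is used. Pairwise LPIPS is omitted because it requires a pretrained network.

```python
def pixelwise_mean(samples):
    """Average several generated samples pixel by pixel."""
    n = len(samples)
    return [sum(px) / n for px in zip(*samples)]

def pixelwise_variance(samples):
    """Population variance of each pixel across the generated samples."""
    mean = pixelwise_mean(samples)
    n = len(samples)
    return [sum((v - m) ** 2 for v in px) / n
            for px, m in zip(zip(*samples), mean)]

def diversity_score(samples):
    """Mean per-pixel variance; zero means all samples are identical."""
    var = pixelwise_variance(samples)
    return sum(var) / len(var)

print(diversity_score([[0.0, 0.0], [2.0, 2.0]]))   # 1.0
print(diversity_score([[0.3, 0.7], [0.3, 0.7]]))   # 0.0, the signature of collapsed samples
```

The pixelwise mean is also the natural fidelity-oriented estimate when several samples are drawn per input, so the same helpers serve both sides of the fidelity-realism comparison.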
Circularity Check
No circularity: HazeMatching is a standard adaptation of conditional flow matching with empirical validation
The paper adapts the existing conditional flow matching framework by incorporating guidance from the hazy observation directly into the conditional velocity field. This is presented as a methodological extension rather than a derivation that reduces to its own inputs by construction. No equations are shown that define a quantity in terms of itself or rename a fitted parameter as a prediction. Evaluations rely on comparisons across datasets and baselines plus calibration plots, which are external to the model definition. Any self-citations (if present) support background concepts but are not load-bearing for the central claim of balanced fidelity-realism, which rests on reported empirical results rather than tautological reduction. The absence of an explicit degradation operator is a stated advantage, not a circular assumption.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Guiding the conditional velocity field with the hazy observation produces outputs that are both faithful and perceptually realistic.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We achieve this by adapting the conditional flow matching framework by guiding the generative process with a hazy observation in the conditional velocity field."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: $\min_\theta \, \mathbb{E}\, \lVert v_\theta^t(x_t, x_0^M) - (x_1^M - x_0) \rVert^2$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.