arxiv: 2512.15905 · v4 · submitted 2025-12-17 · 📡 eess.IV

SNIC: Synthesized Noisy Images using Calibration

Pith reviewed 2026-05-16 21:05 UTC · model grok-4.3

classification 📡 eess.IV
keywords noise synthesis, heteroscedastic noise, RAW image denoising, calibration pipeline, SNIC dataset, dark frames, sensor noise modeling, denoising training data

The pith

A dark-frame calibration pipeline for heteroscedastic noise models generates synthesized RAW images that reduce the PSNR gap to real noise by 54-64 percent compared to manufacturer profiles.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a calibration method for noise models in cameras that uses dark frames to better capture signal-independent noise after ISP processing. This allows creation of synthetic noisy images that more closely match real-world noise distributions than standard manufacturer noise profiles. The resulting SNIC dataset provides over 6600 paired images from multiple sensors for training denoising models. If accurate, this approach could improve the performance of denoisers trained on synthetic data by making them generalize better to actual captured images.

Core claim

By incorporating dark frames into a rigorous calibration and tuning pipeline for heteroscedastic noise models across various sensors, the synthesized noisy RAW images achieve a 54-64% reduction in the PSNR gap to real-world noise when evaluated with a state-of-the-art denoiser, compared to images synthesized using manufacturer-provided noise profiles that do not account for smartphone ISP noise suppression.

What carries the argument

The calibration and tuning pipeline that uses dark frames to capture signal-independent noise components in heteroscedastic noise models for different sensors including DSLR, point-and-shoot, and smartphone.
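
As a rough illustration of what such a model does at synthesis time, the sketch below adds zero-mean Gaussian noise whose variance grows linearly with the clean signal, in the style of the DNG NoiseProfile parameterization, and optionally adds a dark-frame residual. The function name, the parameter names `a` and `b`, the normalization to [0, 1], and the way the dark frame is added are all assumptions made for illustration; the paper's released code may work differently.

```python
import math
import random


def synthesize_noisy(clean, a, b, dark_frame=None, rng=None):
    """Add heteroscedastic Gaussian noise with variance a*x + b to each pixel.

    clean      : list of clean pixel values, normalized to [0, 1] (assumption)
    a, b       : signal-dependent scale and signal-independent offset (hypothetical names)
    dark_frame : optional list of dark-frame residuals, black level already
                 subtracted (assumption about how dark frames are used)
    rng        : optional random.Random instance for reproducibility
    """
    if rng is None:
        rng = random.Random()
    noisy = []
    for i, x in enumerate(clean):
        # Standard deviation depends on the clean signal level.
        sigma = math.sqrt(max(a * x + b, 0.0))
        value = x + rng.gauss(0.0, sigma)
        if dark_frame is not None:
            # Signal-independent component measured from a real dark frame.
            value += dark_frame[i]
        # Clip to the valid normalized range.
        noisy.append(min(max(value, 0.0), 1.0))
    return noisy
```

Calling `synthesize_noisy(pixels, a, b, rng=random.Random(0))` with per-channel, per-ISO values of `a` and `b` would give a repeatable noisy version of `pixels`. Passing `dark_frame=None` corresponds to the cameras for which the paper reports that extra dark current made the synthesis worse.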

Load-bearing premise

The dark-frame-based calibration accurately captures the effective noise distribution after smartphone ISP processing and the PSNR improvements generalize beyond the tested denoiser and scenes.

What would settle it

Measuring the PSNR of a different state-of-the-art denoiser on the synthesized images versus real noisy images and finding the gap reduction falls below 50 percent would challenge the claim.
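
The check described above can be sketched as a small calculation. The definition of "gap reduction" used here (relative shrinkage of the absolute PSNR difference to the real-noise result) is an assumption, since the exact formula is not given in this summary; the function names and the 50 percent default threshold simply mirror the wording of the test.

```python
def psnr_gap_reduction(psnr_real, psnr_baseline_synth, psnr_calibrated_synth):
    """Percent reduction of the PSNR gap to real noise (assumed definition).

    psnr_real             : denoiser PSNR on real noisy images
    psnr_baseline_synth   : denoiser PSNR on manufacturer-profile syntheses
    psnr_calibrated_synth : denoiser PSNR on calibrated syntheses
    """
    gap_baseline = abs(psnr_baseline_synth - psnr_real)
    gap_calibrated = abs(psnr_calibrated_synth - psnr_real)
    if gap_baseline == 0:
        raise ValueError("baseline gap is zero; reduction is undefined")
    return 100.0 * (gap_baseline - gap_calibrated) / gap_baseline


def challenges_claim(reduction_percent, threshold=50.0):
    """True if the measured reduction falls below the stated threshold."""
    return reduction_percent < threshold
```

With made-up PSNR values of 30 dB (real), 35 dB (manufacturer profile), and 32 dB (calibrated), the gap shrinks from 5 dB to 2 dB, which this definition scores as a 60 percent reduction.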

Figures

Figures reproduced from arXiv: 2512.15905 by Nik Bhatt.

Figure 1: Comparison of real camera noise (Sony A7R III) at base and high ISO versus …
Figure 2: Comparison of DNG Profiles. Note: Y-axis scales differ between cameras due to …
Figure 3: DNG NoiseProfile scaling term (a) for iPhone 11 Pro and iPhone 15 Pro across …
Figure 4: Crops of calibration and tuning iPhone 11 Pro images shot at the same ISO and …
Figure 5: Flat-field crop (512x512 pixels) for Sony A7R III showing low variance in pixel …
Figure 6: PTC for iPhone 11 Pro main camera
    … 1 second. We took care to disable any in-camera noise reduction to get accurate dark frame images. Those dark frame images are used in the synthesis process as described below.
    3.5 Tuning the Noise Model
    After observing non-linearity in the initial noise models for the Sony RX100 IV and the iPhone, we confirmed the issue by visually comparing real noisy images with calibra…
Figure 7: DNG, Calibrated, and Tuned Noise Models
    For each target noisy ISO, we load the calibrated per-channel (and per-ISO) noise model and, optionally, the list of dark frame images. Not all cameras benefit from dark current synthesis. Specifically, we found the Sony RX100 IV and Sony A7R III noise models were extremely accurate and additional dark current worsened the synthesis quality. For the iPhone cameras, i…
Figure 8: Comparison of noise at ISO 1600 for the iPhone 11 Pro telephoto camera
Figure 9: iPhone 11 Pro Metrics for Real Noisy Image Pairs
Figure 10: iPhone 11 Pro main camera
    4.3 NAFNet Denoising Results
    The Nonlinear Activation Free Network (NAFNet) (Chen et al., 2022) is a denoising model that, despite its simplicity, achieves state-of-the-art results. We used their model pretrained on the real-noise SIDD dataset (NAFNet-SIDD-width64) (Abdelhamed et al., 2018). We …
Figure 11: LPIPS measurements for all cameras
Figure 12: Comparison of noise at ISO 800 for the iPhone 11 Pro main camera
Figure 13: Comparison of noise at ISO 12800 for the Sony RX100 IV
Figure 14: Comparison of noise at ISO 6400 for the Sony A7R III
Original abstract

Training advanced denoising models requires large datasets of high-fidelity, physically accurate images. While heteroscedastic noise models can simulate realistic noise, methodologies for their calibration remain under-explored, and large-scale calibrated datasets are scarce. We present a rigorous calibration and tuning pipeline for building high-quality heteroscedastic noise models across a range of sensors, incorporating dark frames to capture signal-independent noise. When evaluated with a state-of-the-art denoiser, our synthesized noisy RAW images reduce the Peak Signal-to-Noise Ratio (PSNR) gap to real-world noise by 54-64% compared to synthesized RAW images created using manufacturer-provided noise profiles, which fail to account for smartphone ISP processing that suppresses noise in RAW files during calibration. Leveraging our pipeline, we introduce the Synthesized Noisy Images using Calibration (SNIC) dataset: over 6600 images across 30 scenes and four sensors (DSLR, point-and-shoot, and smartphone), with open-source calibration code and noise models. To our knowledge, SNIC is the only publicly available dataset with calibrated synthesized noise providing paired RAW and TIFF data, offering a new resource for researchers developing noise reduction models.
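
One common way to calibrate a linear heteroscedastic model, sketched here under stated assumptions, is to fit a straight line to (mean, variance) pairs measured on flat-field patches and to estimate the signal-independent floor from dark frames. This is a generic photon-transfer-style fit, not the paper's pipeline; the function names and the ordinary least-squares choice are hypothetical, and the paper's additional tuning step is not represented.

```python
import statistics


def fit_linear_noise_model(means, variances):
    """Ordinary least-squares fit of variance = a * mean + b.

    means, variances : per-patch statistics from flat-field captures
                       (how patches are chosen is outside this sketch)
    Returns (a, b).
    """
    if len(means) != len(variances) or len(means) < 2:
        raise ValueError("need at least two (mean, variance) pairs")
    mean_x = statistics.fmean(means)
    mean_y = statistics.fmean(variances)
    sxx = sum((x - mean_x) ** 2 for x in means)
    if sxx == 0:
        raise ValueError("all patch means are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(means, variances))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def dark_frame_noise_floor(dark_pixels):
    """Population variance of dark-frame pixel values (signal-independent noise)."""
    return statistics.pvariance(dark_pixels)
```

Comparing the fitted intercept `b` with `dark_frame_noise_floor(...)` gives a quick sanity check: a large mismatch would suggest the flat-field fit alone is not describing the signal-independent noise well.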

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper presents a calibration pipeline for heteroscedastic noise models that uses dark frames to capture signal-independent noise components suppressed by smartphone ISPs. It introduces the SNIC dataset of over 6600 synthesized noisy RAW images across 30 scenes and four sensors, with paired TIFF data and open-source calibration code. The central claim is that SNIC-synthesized images reduce the PSNR gap to real noisy images by 54-64% relative to manufacturer-provided profiles when evaluated with a state-of-the-art denoiser.

Significance. If the calibration pipeline is accurate and the reported improvement generalizes, the work supplies a reproducible resource for generating training data for denoising models, particularly for sensors where ISP processing alters effective noise. The open release of the dataset, paired RAW/TIFF data, and calibration code is a concrete strength that supports reproducibility.

major comments (2)
  1. [Experimental Results] The 54-64% PSNR gap reduction (abstract and experimental results) is evaluated exclusively with one state-of-the-art denoiser. No ablations across denoiser families (CNN vs. transformer vs. classical), no per-scene variance, and no statistical tests are reported to confirm the gain is driven by the calibration rather than denoiser-specific sensitivity. This is load-bearing for the headline claim that the synthesized images produce generally superior training data.
  2. [Experimental Results] The evaluation protocol, baseline details, and error analysis are insufficiently specified, preventing verification of whether post-hoc choices or limited testing affect the 54-64% figure (as reflected in the low soundness rating).
minor comments (1)
  1. [Abstract] The abstract refers to 'a state-of-the-art denoiser' without naming the model or architecture; this detail should be stated explicitly for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comments point-by-point below, agreeing where additional details are warranted and providing justification for our experimental choices. We will incorporate clarifications and supplementary analyses in the revised version to improve verifiability.

Point-by-point responses
  1. Referee: [Experimental Results] The 54-64% PSNR gap reduction (abstract and experimental results) is evaluated exclusively with one state-of-the-art denoiser. No ablations across denoiser families (CNN vs. transformer vs. classical), no per-scene variance, and no statistical tests are reported to confirm the gain is driven by the calibration rather than denoiser-specific sensitivity. This is load-bearing for the headline claim that the synthesized images produce generally superior training data.

    Authors: We selected a representative state-of-the-art denoiser to evaluate the practical utility of the SNIC calibration pipeline, as our core claim concerns the improvement in training data quality for such models relative to manufacturer profiles. The headline result is framed specifically around this evaluation rather than universality across all possible denoisers. To address the concern about consistency, we will add per-scene PSNR variance reporting and statistical significance tests (e.g., paired t-tests across scenes) in the revised manuscript. Comprehensive ablations across CNN, transformer, and classical families fall outside the primary scope of demonstrating the calibration method and releasing the dataset; we maintain that the reported gains support the value of SNIC as training data without requiring exhaustive cross-architecture validation.

    revision: partial

  2. Referee: [Experimental Results] The evaluation protocol, baseline details, and error analysis are insufficiently specified, preventing verification of whether post-hoc choices or limited testing affect the 54-64% figure (as reflected in the low soundness rating).

    Authors: We agree that the current manuscript would benefit from expanded specification of the evaluation protocol. In the revision, we will include detailed descriptions of the denoiser architecture and training hyperparameters, exact baseline implementation steps, the precise method for computing the PSNR gap to real noise, and any error or variance analysis performed. These additions will enable full reproduction and verification of the 54-64% figure.

    revision: yes
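
The paired t-test across scenes mentioned in the first response could look roughly like the sketch below. It computes only the t statistic and degrees of freedom using the standard library; the function name and the input layout (one PSNR value per scene for each synthesis method) are assumptions, and obtaining a p-value would need a t-distribution table or a library routine such as `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for per-scene scores from two methods.

    scores_a, scores_b : equal-length lists, one value per scene
                         (e.g. per-scene PSNR gaps; hypothetical layout)
    Returns (t, degrees_of_freedom).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least two scenes")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

Pairing by scene matters here because scene content strongly affects PSNR; testing the per-scene differences removes that shared variation from the comparison.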

Circularity Check

0 steps flagged

No circularity: empirical PSNR gap reduction measured against real data

Full rationale

The paper's central result is an empirical comparison: SNIC-synthesized noisy RAW images, produced via a dark-frame calibration pipeline, are fed to one denoiser and yield a 54-64% smaller PSNR gap to real noisy images than manufacturer-profile syntheses. This metric is computed directly from held-out real captures and is not algebraically or statistically forced by the calibration parameters. No equations define the reported improvement in terms of the fitted noise model; no self-citation supplies a uniqueness theorem or ansatz; the calibration itself uses independent dark-frame measurements rather than the target PSNR quantity. The derivation chain therefore remains self-contained and externally falsifiable.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on the standard assumption that heteroscedastic noise models adequately describe RAW sensor noise and that dark frames provide sufficient signal-independent calibration data; no new entities are postulated.

free parameters (1)
  • heteroscedastic noise parameters
    Parameters of the noise model are calibrated and tuned per sensor using dark frames and the described pipeline.
axioms (1)
  • domain assumption: Heteroscedastic noise model is appropriate for RAW sensor data
    Invoked to justify the synthesis approach across DSLR, point-and-shoot, and smartphone sensors.
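
For readers who want the ledger entries in symbols, a common way to write a heteroscedastic model of this kind is shown below. The symbols are assumptions for illustration: $x$ is the clean signal, $y$ the noisy observation, and $a$, $b$ the per-channel, per-ISO parameters, following the linear-variance form used by the DNG NoiseProfile tag. The paper's tuned model may deviate from this form.

```latex
y = x + n, \qquad n \sim \mathcal{N}\!\bigl(0,\ \sigma^2(x)\bigr), \qquad \sigma^2(x) = a\,x + b
```

Here $a$ scales the signal-dependent (shot-noise-like) part and $b$ is the signal-independent floor, which is the component the dark frames are meant to inform.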

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages · 3 internal anchors

  1. Y. Zhang, H. Qin, X. Wang, and H. Li, "Rethinking noise synthesis and modeling in raw denoising," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 4593–4601.

  2. A. Abdelhamed, M. A. Brubaker, and M. S. Brown, "Noise flow: Noise modeling with conditional normalizing flows," in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 3165–3173.

  3. Adobe Systems Incorporated, Digital negative (DNG) specification. Adobe, 2023. Available: https://helpx.adobe.com/photoshop/digital-negative.html

  4. G. E. Healey and R. Kondepudy, "Radiometric CCD camera calibration and noise estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 3, pp. 267–276, 1994. doi: 10.1109/34.273734

  5. C. Chen, Q. Chen, J. Xu, and V. Koltun, "Learning to see in the dark," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3291–3300.

  6. A. Abdelhamed, S. Lin, and M. S. Brown, "A high-quality denoising dataset for smartphone cameras," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 1692–1700.

  7. J. Anaya and A. Ortiz, "RENOIR: A dataset for real low-light image noise reduction," arXiv preprint arXiv:1805.10854, 2018.

  8. T. Plötz and S. Roth, "Benchmarking denoising algorithms with real photographs," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1586–1595.

  9. K. Wei, Y. Fu, J. Yang, Z. Ying, Y. Gao, and H. Huang, "A physics-based noise formation model for extreme low-light raw denoising," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 10011–10020.

  10. B. Flepp, Y. Mei, Z. Xia, Z. Xia, Z. Xia, et al., "Real-world mobile image denoising dataset with efficient baselines," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 19635–19645. Available: https://openaccess.thecvf.com/content/CVPR2024/papers/Flepp_Real-World_Mobile_Image_Denoising_Datas...

  11. J. Xu, H. Li, Z. Liang, and D. Zhang, "Real-world noisy image denoising: A new benchmark," arXiv preprint arXiv:1804.02603, 2018. Available: https://arxiv.org/abs/1804.02603

  12. L. Chen, X. Chu, X. Zhang, and J. Sun, "Simple baselines for image restoration," in Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VII, 2022, pp. 17–33. doi: 10.1007/978-3-031-20071-7_2

  13. R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, "The unreasonable effectiveness of deep features as a perceptual metric," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. doi: 10.48550/arXiv.1801.03924

  14. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004. doi: 10.1109/TIP.2003.819861

Appendix
    Following are additional plots of LPI...