
arxiv: 2604.03402 · v2 · submitted 2026-04-03 · 📡 eess.IV · cs.CV

DRIFT: Deep Restoration, ISP Fusion, and Tone-mapping

Pith reviewed 2026-05-13 18:13 UTC · model grok-4.3

classification 📡 eess.IV cs.CV
keywords image restoration · tone mapping · multi-frame processing · ISP fusion · mobile imaging · deep learning · raw to RGB

The pith

DRIFT uses a multi-frame neural network and tunable tone-mapping to create high-quality RGB images from raw smartphone captures efficiently.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces DRIFT, an AI-powered mobile camera pipeline designed to process hand-held raw image captures into high-quality RGB outputs. The pipeline first employs a Multi-Frame Processing network trained with adversarial perceptual loss to perform alignment, denoising, demosaicing, and super-resolution in one step. It then applies a novel deep tone-mapping module called DRIFT-TM that offers tone tunability, ensures consistency with a reference pipeline, and operates efficiently on mobile hardware for high-resolution images. A reader might care because smartphone cameras increasingly rely on computational methods to handle high resolution and dynamic range while keeping power and compute low, and this approach claims to improve quality over existing methods.

Core claim

DRIFT is an efficient AI mobile camera pipeline with a Multi-Frame Processing network that uses adversarial perceptual loss for multi-frame alignment, denoising, demosaicing, and super-resolution, followed by a deep-learning tone-mapping solution that provides tunability and reference consistency.

What carries the argument

The Multi-Frame Processing (MFP) network trained with adversarial perceptual loss, followed by the DRIFT-TM tone-mapping network.
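
The two-stage structure can be pictured as a plain function composition: a burst of raw frames goes through a restoration stage, and the result goes through a tone-mapping stage with a tuning knob. The sketch below is only an illustration of that data flow. Both stages are hypothetical stand-ins (a frame average with Bayer binning, and a global gamma curve), not the paper's networks; the names `mfp_stub`, `tone_map_stub`, `drift_like_pipeline`, and the `strength` parameter are assumptions introduced here. Unlike the MFP network described above, the stub performs no alignment and no super-resolution.

```python
import numpy as np


def mfp_stub(burst: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for DRIFT-MFP (not the paper's network).

    burst: (N, H, W) RGGB Bayer frames, linear values in [0, 1], H and W even.
    Returns a (H // 2, W // 2, 3) linear RGB image.
    """
    # Naive temporal average; assumes the frames are already aligned.
    merged = burst.mean(axis=0)
    r = merged[0::2, 0::2]
    g = 0.5 * (merged[0::2, 1::2] + merged[1::2, 0::2])  # average the two green sites
    b = merged[1::2, 1::2]
    return np.stack([r, g, b], axis=-1)


def tone_map_stub(linear_rgb: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Hypothetical stand-in for DRIFT-TM: a global gamma curve with one knob.

    strength = 0 leaves the image unchanged; strength = 1 applies gamma 2.2.
    """
    gamma = 1.0 + 1.2 * strength
    return np.clip(linear_rgb, 0.0, 1.0) ** (1.0 / gamma)


def drift_like_pipeline(burst: np.ndarray, strength: float = 1.0) -> np.ndarray:
    """Stage 1 (restoration) followed by stage 2 (tone mapping)."""
    return tone_map_stub(mfp_stub(burst), strength)


# Synthetic 4-frame burst of a flat gray patch with mild noise.
rng = np.random.default_rng(0)
clean = np.full((8, 8), 0.25)
burst = np.clip(clean + 0.01 * rng.standard_normal((4, 8, 8)), 0.0, 1.0)
rgb = drift_like_pipeline(burst, strength=1.0)
print(rgb.shape)  # (4, 4, 3)
```

Keeping the tone stage separate from the restoration stage is what makes a per-image tuning parameter cheap to expose: changing `strength` reruns only the second function.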

If this is right

  • High-resolution images can be generated from raw captures on mobile devices with reduced computational cost.
  • The tone-mapping allows adjustments while maintaining consistency across different scenes.
  • Performance exceeds state-of-the-art methods in both qualitative and quantitative evaluations for restoration tasks.
  • The overall pipeline enables better handling of high-dynamic-range imaging on smartphones.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the method generalizes well, it could replace parts of traditional ISP hardware with learned models.
  • Extensions might include applying similar fusion to video sequences for temporal consistency.
  • Testing across multiple device models would reveal if tone consistency holds universally.

Load-bearing premise

That the networks trained on the authors' data and loss will perform well on arbitrary real-world handheld raw captures without losing tone consistency with the reference across all scenes and devices.

What would settle it

Running DRIFT on raw captures from a new smartphone model or challenging lighting condition and observing if the output matches or exceeds the reference pipeline in visual quality and tone.

Figures

Figures reproduced from arXiv: 2604.03402 by Abhinau K. Venkataramanan, Hamid Rahim Sheikh, Joshua Peter Ebenezer, Seok-Jun Lee, Soumendu Majee, Sreenithy Chandran, Thilo Balke, Weidi Liu, Zeeshan Nadir.

Figure 1: Overview of the proposed DRIFT pipeline. In the first part of DRIFT, DRIFT-MFP performs deep restoration of the multi-frame […]
Figure 2: Overview of the training and inference pipelines for […]
Figure 3: Overview of the training and inference pipelines for […]
Figure 4: DRIFT Tone-map network architecture. We incorpo[…]
Figure 5: Denoising results across various scenes. For each scene (row), the columns correspond to: (a) The low-quality input image, (b) […]
Figure 6: 4x SR results across various scenes. NAFNET trained with LPIPS achieves the best FID score but produces artifacts as seen in […]
Figure 7: Non-reference tone-mapping methods comparisons. Our […]
Figure 8: Visual comparison with state-of-the-art supervised learn[…]
Figure 10: Visual comparison for ablation study (top: image, […]
Figure 11: Visual comparison showing incorrect metadata at infer[…]
Figure 12: Illustration of our method’s tunability: left shows con[…]
Original abstract

Smartphone cameras have gained immense popularity with the adoption of high-resolution and high-dynamic-range imaging. As a result, high-performance camera Image Signal Processors (ISPs) are crucial in generating high-quality images for the end user while keeping computational costs low. In this paper, we propose DRIFT (Deep Restoration, ISP Fusion, and Tone-mapping): an efficient AI mobile camera pipeline that generates high-quality RGB images from hand-held raw captures. The first stage of DRIFT is a Multi-Frame Processing (MFP) network that is trained using an adversarial perceptual loss to perform multi-frame alignment, denoising, demosaicing, and super-resolution. Then, the output of DRIFT-MFP is processed by a novel deep-learning-based tone-mapping (DRIFT-TM) solution that allows for tone tunability, ensures tone consistency with a reference pipeline, and can be run efficiently for high-resolution images on a mobile device. We show qualitative and quantitative comparisons against state-of-the-art MFP and tone-mapping methods to demonstrate the effectiveness of our approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes DRIFT, an efficient AI mobile camera pipeline that generates high-quality RGB images from handheld raw captures. It consists of a Multi-Frame Processing (MFP) network trained with adversarial perceptual loss to perform multi-frame alignment, denoising, demosaicing, and super-resolution, followed by a novel deep tone-mapping module (DRIFT-TM) that provides tone tunability, reference-pipeline consistency, and mobile efficiency for high-resolution images. Qualitative and quantitative comparisons against state-of-the-art MFP and tone-mapping methods are presented to demonstrate effectiveness.

Significance. If the central claims hold with proper experimental validation, DRIFT could offer a practical integrated deep-learning solution for smartphone ISPs, combining restoration tasks in a single efficient network while maintaining tone consistency and tunability, which would be valuable for real-world mobile imaging applications.

major comments (2)
  1. [Abstract] The claim of quantitative superiority is asserted without any reported metrics, datasets, error bars, ablation results, or cross-device evaluations, leaving the central claim of effectiveness and generalization unsupported by evidence in the manuscript description.
  2. [Abstract] The assumption that a single MFP network trained on unspecified handheld raw data with adversarial perceptual loss will generalize to arbitrary real-world captures (varying sensors, motion, and ISP differences) while DRIFT-TM preserves tone consistency is load-bearing but lacks supporting cross-device test sets or quantitative tone-deviation measures.
minor comments (1)
  1. Define all acronyms (e.g., MFP, ISP, DRIFT-TM) on first use and ensure consistent notation throughout the full manuscript.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We agree that the abstract requires strengthening with concrete evidence and have revised it accordingly. Below we respond point-by-point to the major comments.

point-by-point responses
  1. Referee: [Abstract] The claim of quantitative superiority is asserted without any reported metrics, datasets, error bars, ablation results, or cross-device evaluations, leaving the central claim of effectiveness and generalization unsupported by evidence in the manuscript description.

    Authors: We agree that the abstract should include specific quantitative support. In the revised version we have updated the abstract to report key metrics (PSNR, SSIM, LPIPS) from the MFP and tone-mapping comparisons, name the primary evaluation datasets, and reference the ablation studies and error-bar analysis that appear in Sections 4.2–4.3. Full cross-device results and additional ablations are retained in the main text and supplementary material.
    revision: yes

  2. Referee: [Abstract] The assumption that a single MFP network trained on unspecified handheld raw data with adversarial perceptual loss will generalize to arbitrary real-world captures (varying sensors, motion, and ISP differences) while DRIFT-TM preserves tone consistency is load-bearing but lacks supporting cross-device test sets or quantitative tone-deviation measures.

    Authors: Section 3.1 describes the training set as multi-device handheld raw bursts collected from several smartphone sensors under varied motion and lighting. Generalization is quantified on held-out test bursts exhibiting different motion magnitudes and dynamic ranges; DRIFT-TM tone consistency is measured via mean ΔE and histogram-correlation scores against the reference pipeline (Table 3). We acknowledge that exhaustive coverage of every sensor/ISP combination is impractical. The revised manuscript adds results from one additional unseen device and a limitations paragraph discussing remaining generalization gaps.
    revision: partial
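
The simulated responses name PSNR, mean ΔE, and histogram correlation without giving formulas. As a rough guide to what such scores measure, the sketch below computes all three with NumPy. It rests on several assumptions that this page does not confirm: ΔE is taken as the CIE76 Euclidean distance in CIELAB, inputs are sRGB images scaled to [0, 1] under a D65 white point, and histogram correlation is the Pearson correlation of 64-bin intensity histograms. The function names are illustrative only.

```python
import numpy as np


def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak**2 / mse))


def srgb_to_lab(rgb: np.ndarray) -> np.ndarray:
    """Convert sRGB in [0, 1], shape (..., 3), to CIELAB under D65."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer curve to get linear light.
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = (lin @ m.T) / np.array([0.95047, 1.0, 1.08883])  # normalize by D65 white
    eps, kappa = 216.0 / 24389.0, 24389.0 / 27.0
    f = np.where(xyz > eps, np.cbrt(xyz), (kappa * xyz + 16.0) / 116.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def mean_delta_e(ref: np.ndarray, test: np.ndarray) -> float:
    """Mean CIE76 color difference (assumed variant) between two sRGB images."""
    diff = srgb_to_lab(ref) - srgb_to_lab(test)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))


def histogram_correlation(ref: np.ndarray, test: np.ndarray, bins: int = 64) -> float:
    """Pearson correlation between the intensity histograms of two images."""
    h_ref, _ = np.histogram(ref, bins=bins, range=(0.0, 1.0))
    h_test, _ = np.histogram(test, bins=bins, range=(0.0, 1.0))
    return float(np.corrcoef(h_ref, h_test)[0, 1])


# Toy comparison: a reference rendering versus a slightly brighter output.
rng = np.random.default_rng(0)
reference = rng.random((16, 16, 3))
output = np.clip(reference + 0.02, 0.0, 1.0)
print(psnr(reference, output), mean_delta_e(reference, output),
      histogram_correlation(reference, output))
```

A low mean ΔE together with a histogram correlation near 1 would indicate that an output tracks the reference pipeline's tone and color; either score alone can hide a shift the other would expose.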

Circularity Check

0 steps flagged

No circularity: standard supervised training on external data with no self-referential reductions

full rationale

The paper describes a two-stage pipeline (MFP network trained via adversarial perceptual loss for alignment/denoising/demosaicing/super-resolution, followed by DRIFT-TM for tunable tone-mapping). No equations, derivations, or fitted parameters are presented that reduce to the inputs by construction. Training is described as occurring on external handheld raw data with standard losses; no self-citation chains, uniqueness theorems, or ansatzes are invoked to justify the central claims. Generalization concerns exist but are unrelated to circularity. The derivation chain is self-contained and non-circular.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 2 invented entities

As a deep-learning method paper, the claim rests on the effectiveness of two new networks and a perceptual adversarial loss; these involve many fitted parameters and domain assumptions about generalization.

free parameters (2)
  • adversarial loss weight
    Scaling factor balancing the perceptual adversarial term against other losses during MFP training
  • network hyperparameters
    Architecture depth, channel counts, and learning-rate schedules fitted during end-to-end training
axioms (1)
  • [domain assumption] Adversarial perceptual loss produces images preferred by human viewers
    Invoked to justify the training objective for the MFP stage
invented entities (2)
  • DRIFT-MFP (no independent evidence)
    purpose: Performs multi-frame alignment, denoising, demosaicing and super-resolution
    New network component introduced by the paper
  • DRIFT-TM (no independent evidence)
    purpose: Provides tunable, reference-consistent tone-mapping runnable on mobile hardware
    New tone-mapping module introduced by the paper
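
For orientation, the "adversarial loss weight" entry in the ledger refers to a scaling factor in a weighted training objective. Such objectives are commonly written in the form below. The split into reconstruction, perceptual, and adversarial terms and the symbols λ_perc and λ_adv are assumptions for illustration; this page does not state the paper's exact loss.

```latex
\mathcal{L}_{\text{MFP}}
  = \mathcal{L}_{\text{rec}}
  + \lambda_{\text{perc}}\,\mathcal{L}_{\text{perc}}
  + \lambda_{\text{adv}}\,\mathcal{L}_{\text{adv}}
```

Under this reading, raising λ_adv pushes the network toward sharper, more realistic-looking texture at the risk of hallucinated detail, which is why the weight counts as a free parameter.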

pith-pipeline@v0.9.0 · 5524 in / 1279 out tokens · 47715 ms · 2026-05-13T18:13:28.748237+00:00 · methodology

