UniBlendNet: Unified Global, Multi-Scale, and Region-Adaptive Modeling for Ambient Lighting Normalization
Pith reviewed 2026-05-10 13:51 UTC · model grok-4.3
The pith
UniBlendNet combines long-range global modeling, pyramid multi-scale aggregation, and mask-guided local refinement to normalize ambient lighting in degraded images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UniBlendNet addresses ambient lighting normalization with three components: a UniConvNet-based module that captures long-range dependencies for global illumination modeling, a Scale-Aware Aggregation Module (SAAM) that performs pyramid-based multi-scale feature aggregation with dynamic reweighting to handle complex lighting variations, and a mask-guided residual refinement mechanism for region-adaptive correction. The paper reports that this combination improves illumination consistency and structural fidelity on the NTIRE benchmark relative to the IFBlend baseline.
What carries the argument
The UniBlendNet framework unifies three components: a UniConvNet module for global long-range context, a Scale-Aware Aggregation Module for dynamic pyramid multi-scale aggregation, and mask-guided residual refinement for selective local enhancement.
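As a rough illustration of the mask-guided residual refinement idea, the sketch below applies a predicted residual only where a soft mask marks pixels as degraded. The blending rule `base + mask * residual`, the function name `mask_guided_refine`, and the toy values are assumptions made for illustration; the abstract does not give the exact formulation.

```python
def mask_guided_refine(base, residual, mask):
    """Apply a residual correction weighted per pixel by a soft mask in [0, 1].

    Hypothetical formulation: out = base + mask * residual.
    mask == 0 leaves a pixel untouched; mask == 1 applies the full residual.
    """
    return [
        [b + m * r for b, r, m in zip(b_row, r_row, m_row)]
        for b_row, r_row, m_row in zip(base, residual, mask)
    ]


# Toy 2x2 single-channel example (values are illustrative only).
base = [[0.2, 0.8], [0.3, 0.9]]
residual = [[0.4, 0.4], [0.4, 0.4]]
mask = [[1.0, 0.0], [0.5, 0.0]]  # left column treated as degraded

refined = mask_guided_refine(base, residual, mask)
```

In this toy case the right column, where the mask is zero, passes through unchanged, which mirrors the stated goal of preserving well-exposed areas.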
If this is right
- The model achieves consistently higher restoration quality than the IFBlend baseline on the NTIRE benchmark.
- It produces visually more natural and stable results under complex lighting conditions.
- Illumination consistency and structural fidelity improve through selective enhancement of degraded regions while preserving well-exposed areas.
- The design targets regions where prior frequency-domain methods underperform because of limited global context or insufficient spatial adaptivity.
Where Pith is reading between the lines
- The same unification of global context, multi-scale dynamics, and mask-guided adaptation could apply to related tasks like shadow removal or low-light enhancement without major redesign.
- If the dynamic reweighting proves stable across datasets, it might reduce reliance on task-specific hyperparameter searches in other pyramid-based vision models.
- Deployment in consumer photography apps could follow if inference speed is measured and optimized, since the selective refinement already targets only degraded areas.
Load-bearing premise
Adding the long-range UniConvNet modeling, dynamic pyramid aggregation, and mask-guided refinement will improve restoration quality and naturalness without creating new artifacts or requiring extensive per-benchmark tuning.
What would settle it
If experiments on the NTIRE Ambient Lighting Normalization benchmark show that UniBlendNet fails to outperform IFBlend in restoration metrics or produces visible artifacts in challenging regions, the claim of effective unified modeling would be disproved.
Original abstract
Ambient Lighting Normalization (ALN) aims to restore images degraded by complex, spatially varying illumination conditions. Existing methods, such as IFBlend, leverage frequency-domain priors to model illumination variations, but still suffer from limited global context modeling and insufficient spatial adaptivity, leading to suboptimal restoration in challenging regions. In this paper, we propose UniBlendNet, a unified framework for ambient lighting normalization that jointly models global illumination, multi-scale structures, and region-adaptive refinement. Specifically, we enhance global illumination understanding by integrating a UniConvNet-based module to capture long-range dependencies. To better handle complex lighting variations, we introduce a Scale-Aware Aggregation Module (SAAM) that performs pyramid-based multi-scale feature aggregation with dynamic reweighting. Furthermore, we design a mask-guided residual refinement mechanism to enable region-adaptive correction, allowing the model to selectively enhance degraded regions while preserving well-exposed areas. This design effectively improves illumination consistency and structural fidelity under complex lighting conditions. Extensive experiments on the NTIRE Ambient Lighting Normalization benchmark demonstrate that UniBlendNet consistently outperforms the baseline IFBlend and achieves improved restoration quality, while producing visually more natural and stable restoration results.
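One way to picture pyramid-based multi-scale aggregation with dynamic reweighting is sketched below in plain Python. Features are average-pooled at several scales, upsampled back, and fused with softmax weights computed from the input itself. Everything here is an assumption for illustration: the function names, the choice of scales, and in particular the use of per-branch variance as a stand-in for whatever learned gating SAAM actually uses.

```python
import math


def avg_pool(feat, k):
    """Non-overlapping k x k average pooling on a 2D list (sizes divisible by k)."""
    h, w = len(feat), len(feat[0])
    return [
        [
            sum(feat[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample_nearest(feat, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in feat for _ in range(k)]


def softmax(scores):
    """Numerically stable softmax over a list of scalars."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def variance(feat):
    """Population variance over all entries of a 2D list."""
    vals = [v for row in feat for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def scale_aware_aggregate(feat, scales=(1, 2, 4)):
    """Fuse pyramid branches with input-dependent softmax weights.

    Assumption: per-branch variance stands in for a learned gating
    network; a real module would presumably predict these weights.
    """
    branches = [upsample_nearest(avg_pool(feat, s), s) for s in scales]
    weights = softmax([variance(b) for b in branches])
    h, w = len(feat), len(feat[0])
    fused = [
        [sum(wt * b[i][j] for wt, b in zip(weights, branches)) for j in range(w)]
        for i in range(h)
    ]
    return fused, weights


# Toy 4x4 feature map: bright left half, dark right half.
feat = [[1.0, 1.0, 0.0, 0.0] for _ in range(4)]
fused, weights = scale_aware_aggregate(feat)
```

Because the weights depend on the input, a different feature map would shift the balance between fine and coarse branches, which is the sense of "dynamic reweighting" assumed in this sketch.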
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UniBlendNet, a unified neural architecture for ambient lighting normalization (ALN) that integrates a UniConvNet-based module for capturing long-range global illumination dependencies, a Scale-Aware Aggregation Module (SAAM) performing pyramid multi-scale feature aggregation with dynamic reweighting, and a mask-guided residual refinement mechanism for region-adaptive correction. It claims that this design improves illumination consistency and structural fidelity over the frequency-domain baseline IFBlend, with extensive experiments on the NTIRE ALN benchmark showing consistent outperformance and more natural restoration results.
Significance. If the performance claims hold under rigorous evaluation, the work could advance image restoration methods for spatially varying illumination by providing a unified framework that combines global context modeling, multi-scale handling, and selective refinement—addressing documented limitations of prior approaches like IFBlend. This may have practical value in applications such as photography enhancement and computer vision under uncontrolled lighting.
major comments (1)
- Abstract and Experiments section: The central claim that 'UniBlendNet consistently outperforms the baseline IFBlend' and achieves 'improved restoration quality' rests entirely on unspecified 'extensive experiments' with no reported quantitative metrics (PSNR, SSIM, LPIPS, etc.), ablation studies on the individual modules (UniConvNet, SAAM, mask-guided refinement), dataset statistics, or error analysis. This absence makes the performance gains unverifiable and prevents assessment of whether the architectural additions deliver the claimed benefits without new artifacts.
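For reference, PSNR, one of the metrics the comment asks for, can be computed as in the minimal sketch below. The function name, the nested-list image representation, and the assumption that pixel values lie in [0, 1] are illustrative choices, not details taken from the paper.

```python
import math


def psnr(reference, restored, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    diffs = [
        (r - x) ** 2
        for ref_row, res_row in zip(reference, restored)
        for r, x in zip(ref_row, res_row)
    ]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


# Toy example: pixel values in [0, 1] with a uniform error of 0.1.
reference = [[0.5, 0.5], [0.5, 0.5]]
restored = [[0.6, 0.6], [0.6, 0.6]]
score = psnr(reference, restored)  # about 20 dB
```

Reporting such per-image scores for both UniBlendNet and IFBlend on the same test split is the kind of evidence that would make the outperformance claim checkable.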
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the need for verifiable experimental details. We agree that the current presentation of results requires strengthening with explicit metrics and analyses, and we will revise the manuscript to address this fully.
Point-by-point responses
Referee: Abstract and Experiments section: The central claim that 'UniBlendNet consistently outperforms the baseline IFBlend' and achieves 'improved restoration quality' rests entirely on unspecified 'extensive experiments' with no reported quantitative metrics (PSNR, SSIM, LPIPS, etc.), ablation studies on the individual modules (UniConvNet, SAAM, mask-guided refinement), dataset statistics, or error analysis. This absence makes the performance gains unverifiable and prevents assessment of whether the architectural additions deliver the claimed benefits without new artifacts.
Authors: We agree that the abstract and experiments section, as currently written, do not include the specific quantitative metrics, ablation studies, dataset statistics, or error analysis needed to substantiate the claims. In the revised manuscript, we will add a dedicated experiments section with tables reporting PSNR, SSIM, LPIPS, and other relevant metrics comparing UniBlendNet to IFBlend on the NTIRE ALN benchmark. We will also include ablation studies that isolate the contributions of the UniConvNet module, the Scale-Aware Aggregation Module (SAAM), and the mask-guided residual refinement. Dataset statistics (e.g., number of images, lighting variation characteristics) and qualitative and quantitative error analysis will be provided to show where gains occur and to check whether new artifacts are introduced. These additions will make the performance improvements verifiable and directly address whether each architectural component delivers the claimed benefits.
revision: yes
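The promised ablation over the three modules could be organized as a simple on/off grid. The sketch below only enumerates configurations; the flag names are hypothetical, and treating the all-off configuration as an IFBlend-like baseline is an assumption rather than something the manuscript states.

```python
from itertools import product

MODULES = ("uniconvnet", "saam", "mask_refine")  # hypothetical flag names


def ablation_configs(modules=MODULES):
    """Return every on/off combination of the given modules as a dict."""
    return [
        dict(zip(modules, flags))
        for flags in product((False, True), repeat=len(modules))
    ]


configs = ablation_configs()
baseline = configs[0]     # all modules off (IFBlend-like baseline, by assumption)
full_model = configs[-1]  # all modules on
```

Evaluating each of the eight configurations with the same metrics would show how much each component contributes on its own and in combination.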
Circularity Check
No significant circularity in empirical architecture proposal
The paper describes an empirical neural network architecture (UniBlendNet) with modules for global modeling, multi-scale aggregation, and region-adaptive refinement, evaluated on the NTIRE benchmark against IFBlend. No equations, parameter fits, or derivations are present that reduce by construction to inputs; claims rest on architectural description and experimental results rather than self-referential logic or self-citation chains.