BVI-RLV: A Fully Registered Dataset for Low-Light Video Enhancement

Alexandra Malyugina; David R Bull; Guoxi Huang; Joanne Lin; Nantheera Anantrasirichai; Qi Sun; Ruirui Lin

arxiv: 2407.03535 · v3 · pith:YYMGLY5Vnew · submitted 2024-07-03 · 💻 cs.CV

Ruirui Lin, Guoxi Huang, Joanne Lin, Qi Sun, Alexandra Malyugina, David R Bull, Nantheera Anantrasirichai

Pith reviewed 2026-05-25 08:48 UTC · model grok-4.3

classification 💻 cs.CV

keywords low-light video enhancement · registered dataset · paired frames · sub-pixel registration · supervised learning · video denoising · motion capture · deep learning datasets

The pith

BVI-RLV supplies over 30k sub-pixel registered low-light to normal-light video frame pairs from 40 scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces BVI-RLV as a dataset of paired low-light and normal-light videos that avoids the misalignment common in prior collections. It uses a motorized dolly plus image refinement to align frames at sub-pixel accuracy across dynamic motions and full HD resolution. Experiments establish that training enhancement models on these registered pairs produces higher quality outputs than training on misaligned data from the same scenes. The dataset also yields models that generalize better than those from existing collections when tested across datasets and in outdoor scenes. Baselines are supplied for CNN, Transformer, Mamba, and diffusion architectures.

Core claim

BVI-RLV comprises over 30k paired frames from 40 diverse scenes captured under two low-light conditions and aligned to normal-light ground truth. Sub-pixel registration holds for 99.24 percent of the full-HD data through motorized dolly motion combined with image-based refinement, while covering varied motion types and realistic temporal noise. Registration proves essential for supervised learning, delivering up to 5.85 dB PSNR gains over unregistered training, and models trained on the dataset outperform those from prior collections in cross-dataset tests, including real-world outdoor scenes.

What carries the argument

Motorized dolly movement combined with image-based refinement to produce sub-pixel accurate alignment between low-light and normal-light video frames.
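To make "image-based refinement to sub-pixel accuracy" concrete, here is a minimal one-dimensional sketch. It is not the authors' procedure, which this page does not describe. It assumes a generic approach: pick the integer shift with the lowest mean squared difference, then fit a parabola through the three costs around that minimum. The names `msd_at_shift` and `estimate_shift` and the test signals are hypothetical.

```python
import math


def msd_at_shift(ref, mov, shift, margin):
    """Mean squared difference between ref[i] and mov[i + shift] on a fixed interior window."""
    idx = range(margin, len(ref) - margin)
    return sum((ref[i] - mov[i + shift]) ** 2 for i in idx) / len(idx)


def estimate_shift(ref, mov, max_shift=5):
    """Estimate how far mov is displaced relative to ref, including a sub-sample fraction."""
    if len(ref) != len(mov) or len(ref) <= 2 * max_shift:
        raise ValueError("signals must have equal length greater than 2 * max_shift")
    shifts = list(range(-max_shift, max_shift + 1))
    costs = [msd_at_shift(ref, mov, s, max_shift) for s in shifts]
    k = min(range(len(costs)), key=costs.__getitem__)
    if k == 0 or k == len(costs) - 1:
        return float(shifts[k])  # minimum on the search border: no refinement possible
    c_left, c_mid, c_right = costs[k - 1], costs[k], costs[k + 1]
    denom = c_left - 2.0 * c_mid + c_right
    if denom <= 0.0:
        return float(shifts[k])  # flat cost curve: keep the integer estimate
    # Vertex of the parabola through the three neighbouring costs.
    return shifts[k] + 0.5 * (c_left - c_right) / denom


# Illustrative signals: a smooth bump, and the same bump displaced by 2.4 samples.
ref = [math.exp(-((i - 30.0) / 6.0) ** 2) for i in range(64)]
mov = [math.exp(-((i - 32.4) / 6.0) ** 2) for i in range(64)]
print(estimate_shift(ref, mov))  # should land close to the true displacement of 2.4
```

A real pipeline would work in two dimensions on noisy frames, where the choice of similarity measure and interpolation matters far more than it does in this toy case.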

If this is right

Training enhancement networks on the registered pairs raises PSNR by as much as 5.85 dB relative to unregistered versions of the same data.
Models trained on BVI-RLV exceed the cross-dataset performance of models trained on existing low-light collections.
The dataset supports training that generalizes to real-world outdoor low-light video.
Baseline results for CNN, Transformer, Mamba, and diffusion models become available for direct comparison.
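For readers less familiar with the metric behind the 5.85 dB figure, the sketch below gives the standard PSNR definition, 10·log10(peak² / MSE), and converts a dB gain into the corresponding reduction in mean squared error. The 8-bit peak of 255 is an assumption, since the paper's pixel range is not stated on this page, and the function names are hypothetical.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D frames (lists of rows)."""
    if len(reference) != len(estimate):
        raise ValueError("frames must have the same number of rows")
    squared_error, count = 0.0, 0
    for row_ref, row_est in zip(reference, estimate):
        if len(row_ref) != len(row_est):
            raise ValueError("rows must have the same length")
        for a, b in zip(row_ref, row_est):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


# A 5.85 dB gain corresponds to roughly a 3.85-fold drop in mean squared error.
print(round(mse_ratio_for_gain(5.85), 2))
```

Because PSNR is logarithmic, the same dB gain always means the same multiplicative drop in MSE, whatever the starting quality.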

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Precise frame alignment may enable models to exploit temporal correlations more effectively than misalignment permits.
The capture method could be adapted to other video tasks that require exact low-light to reference pairing.
Public release of the paired sequences may allow researchers to test whether registration quality correlates with gains in temporal consistency metrics.
Superior outdoor performance hints that the dataset captures noise statistics closer to uncontrolled environments than ND-filter approaches.

Load-bearing premise

The dolly motion and refinement process yields alignments that stay sub-pixel accurate and artifact-free across all scenes without introducing systematic biases into downstream model training.

What would settle it

A direct comparison of model performance when trained on BVI-RLV pairs versus the same scenes captured with handheld or static-camera methods that lack the dolly alignment step.

Figures

Figures reproduced from arXiv: 2407.03535 by Alexandra Malyugina, David R Bull, Guoxi Huang, Joanne Lin, Nantheera Anantrasirichai, Qi Sun, Ruirui Lin.

**Figure 2.** Models trained on BVI-RLV generate higher results compared to the other three LLVE ...

**Figure 3.** (Left) Scene setting showing the camera in ‘angle’ position, mounted on CineDrive system.

**Figure 4.** Cropped images (Lego and Kitchen scenes at 350 ...

**Figure 5.** Main architectural components of the four different benchmarking methods, i.e., PCDUNet, ...

**Figure 6.** Subjective results of the BVI-CDM model trained on different datasets, and tested on ...

**Figure 7.** Subjective results of the STA-SUNet model trained on different datasets, and tested on the ...

Original abstract

Low-light videos often exhibit spatiotemporally incoherent noise, compromising visibility and degrading performance in computer vision applications. A major challenge for enhancing such content using deep learning lies in the scarcity of pixel-aligned, high-quality training data. We introduce BVI-RLV, a fully registered low-light video dataset comprising over 30k paired frames from 40 diverse scenes under two low-light conditions, each aligned with normal-light ground truth. Unlike existing datasets that rely on neutral density (ND) filters or suffer from misalignment issues, BVI-RLV achieves sub-pixel registration for 99.24% of data at full HD resolution across dynamic motion scenarios using a motorized dolly and image-based refinement. The dataset covers a wide range of motion types and realistic temporal noise. We also provide baseline implementations using four representative architectures: Convolutional Neural Network (CNN), Transformer, State Space Model (Mamba), and Diffusion Model (DM). Experiments demonstrate that registration is crucial for supervised learning, yielding up to 5.85 dB PSNR improvement compared to unregistered training. Models trained on BVI-RLV outperform those trained on existing datasets in cross-dataset evaluations, achieving superior performance even in real-world outdoor scenes. Our dataset is publicly available at https://doi.org/10.21227/mzny-8c77.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BVI-RLV gives a sizable registered low-light video dataset with useful baselines, but the sub-pixel registration figure lacks an independent check.

The main takeaway is a new dataset of over 30k paired low-light and normal-light video frames from 40 scenes, with a motorized dolly plus refinement step that they say reaches sub-pixel alignment on 99.24% of the data at full HD. They also show that training on the registered pairs improves PSNR by up to 5.85 dB over unregistered versions and beats models trained on prior sets in cross tests, including real outdoor footage. They supply baselines across CNN, transformer, Mamba, and diffusion architectures. That scale and the explicit comparison of registered versus unregistered training are the concrete additions here. The public release helps too.

The registration method itself is the soft spot. The abstract states the 99.24% figure but gives no external reference such as fiducial markers or multi-view consistency to confirm the alignments are free of systematic bias. If the percentage is derived only from the internal refinement step, it could mask small consistent errors that affect the reported training gains. The experiments look controlled enough for a dataset paper, but that one measurement needs clearer documentation.

This work is aimed at people training or evaluating low-light video enhancement models. Anyone in that subfield would find the data and the registration experiment worth looking at. It is solid enough on its own terms to go to peer review rather than desk reject, mainly because the dataset is new, the claims are falsifiable once the data is public, and the baselines are reproducible. Ask the authors for the exact error metric they used to arrive at 99.24%.

Referee Report

2 major / 1 minor

Summary. The paper presents BVI-RLV, a new dataset of >30k paired low-light/normal-light video frames across 40 scenes captured with a motorized dolly plus image-based refinement, claiming 99.24% sub-pixel registration at full HD. It supplies baseline results for CNN, Transformer, Mamba, and diffusion models, reports up to 5.85 dB PSNR gain from using registered versus unregistered pairs, and shows superior cross-dataset generalization including on real outdoor scenes.

Significance. If the registration accuracy and lack of systematic bias hold, the dataset would fill a documented gap in aligned low-light video data and enable more reliable supervised training; the public release and multi-architecture baselines are concrete strengths that would support reproducibility and further work in the area.

major comments (2)

[Abstract / registration section] Abstract and registration-method description: the headline 99.24% sub-pixel registration figure is presented without an independent validation metric (e.g., residual error against fiducial markers, multi-view consistency, or external reference alignment); if the percentage is derived solely from internal convergence of the image-based refinement step, it cannot rule out consistent sub-pixel biases that would affect downstream supervised training and the reported 5.85 dB gain.
[Experiments / ablation studies] Experiments section (cross-dataset and registration-ablation results): the claim that registration is “crucial” and yields up to 5.85 dB improvement requires explicit confirmation that the unregistered training baseline used identical data volume, augmentation, optimizer schedule, and convergence criteria; without those controls the PSNR delta cannot be attributed solely to alignment quality.

minor comments (1)

[Dataset description] The motion-type taxonomy and noise-characterization details would benefit from an explicit table or figure summarizing the 40 scenes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below.

Referee: [Abstract / registration section] Abstract and registration-method description: the headline 99.24% sub-pixel registration figure is presented without an independent validation metric (e.g., residual error against fiducial markers, multi-view consistency, or external reference alignment); if the percentage is derived solely from internal convergence of the image-based refinement step, it cannot rule out consistent sub-pixel biases that would affect downstream supervised training and the reported 5.85 dB gain.

Authors: The 99.24% figure is computed from the residual displacement after the motorized dolly plus image-based refinement, with sub-pixel defined as <1 pixel error via feature matching. We agree this is an internal metric and does not provide fully independent validation (e.g., fiducial markers). In revision we will explicitly detail the computation, add discussion of possible systematic biases, and include multi-frame consistency checks from static scenes as supporting evidence. revision: yes
Referee: [Experiments / ablation studies] Experiments section (cross-dataset and registration-ablation results): the claim that registration is “crucial” and yields up to 5.85 dB improvement requires explicit confirmation that the unregistered training baseline used identical data volume, augmentation, optimizer schedule, and convergence criteria; without those controls the PSNR delta cannot be attributed solely to alignment quality.

Authors: The unregistered baseline used identical data volume, augmentations, optimizer, schedule, and convergence criteria; the sole difference was pair alignment. We will revise the experiments section to state these controls explicitly. revision: yes
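The first response above describes the 99.24% figure as the share of data whose residual displacement after refinement is below one pixel. A minimal sketch of that bookkeeping follows. The residual values are made up for illustration, the function name is hypothetical, and reducing each (dx, dy) residual to its Euclidean magnitude is an assumption, since the response does not say how a 2-D displacement is turned into a single error.

```python
import math


def subpixel_fraction(residuals, threshold=1.0):
    """Share of (dx, dy) residual displacements whose magnitude is below threshold pixels."""
    if not residuals:
        raise ValueError("need at least one residual")
    below = sum(1 for dx, dy in residuals if math.hypot(dx, dy) < threshold)
    return below / len(residuals)


# Made-up residuals in pixels, for illustration only (not values from the paper).
example = [(0.2, -0.1), (0.7, 0.6), (1.3, 0.0), (-0.4, 0.3)]
share = subpixel_fraction(example)
print(f"{100.0 * share:.2f}% of pairs are sub-pixel")
```

A count like this stays an internal check when the residuals come from the same matcher used for refinement, which is the referee's concern: a bias shared by every frame would not appear in it.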

Circularity Check

0 steps flagged

No circularity: empirical dataset contribution with no derivation chain

The paper presents an empirical dataset and baseline experiments rather than any mathematical derivation or fitted-parameter prediction. The sub-pixel registration claim is a reported measurement from the data collection process (motorized dolly + refinement), not a quantity derived from or fitted to the downstream PSNR results. Cross-dataset evaluations are standard empirical comparisons with no self-referential reduction. No equations, ansatzes, or uniqueness theorems are invoked that collapse to the paper's own inputs. This is a self-contained dataset paper; the reader's circularity score of 1.0 is consistent with the absence of load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a dataset creation and benchmarking paper with no mathematical derivations, free parameters, or new postulated entities; it relies on standard computer vision practices for registration and supervised training.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Towards a General-Purpose Zero-Shot Synthetic Low-Light Image and Video Pipeline
cs.CV 2025-04 unverdicted novelty 6.0

A self-supervised Degradation Estimation Network estimates parameters for physics-informed noise distributions to generate realistic synthetic low-light data, showing gains on noise replication, enhancement, and detec...

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1] Chen Chen, Qifeng Chen, Minh N. Do, and Vladlen Koltun. Seeing motion in the dark. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019.
[2] Ruixing Wang, Xiaogang Xu, Chi-Wing Fu, Jiangbo Lu, Bei Yu, and Jiaya Jia. Seeing dynamic scene in the dark: High-quality video dataset with mechatronic alignment. In ICCV, 2021.
[3] Huiyuan Fu, Wenkai Zheng, Xicong Wang, Jiaxuan Wang, Heng Zhang, and Huadong Ma. Dancing in the dark: A benchmark towards general low-light video enhancement. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
[4] Valery Dewil, Jeremy Anger, Axel Davy, Thibaud Ehret, Gabriele Facciolo, and Pablo Arias. Self-supervised training for blind multi-frame video denoising. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2724–2734, January 2021.
[5] Jinxiu Liang, Yong Xu, Yuhui Quan, Boxin Shi, and Hui Ji. Self-supervised low-light image enhancement using discrepant untrained network priors. IEEE Transactions on Circuits and Systems for Video Technology, 32(11):7332–7345, 2022.
[6] N. Anantrasirichai and David Bull. Contextual colorization and denoising for low-light ultra high resolution sequences. In ICIP proc., pages 1614–1618, 2021.
[7] Alexandra Malyugina, Nantheera Anantrasirichai, and David Bull. A topological loss function for image denoising on a new BVI-lowlight dataset. Signal Processing, 211, 2023.
[8] Kristina Monakhova, Stephan R. Richter, Laura Waller, and Vladlen Koltun. Dancing under the stars: video denoising in starlight. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16220–16230, 2022.
[9] Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, and Trevor Darrell. BDD100K: A diverse driving dataset for heterogeneous multitask learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[10] Haiyang Jiang and Yinqiang Zheng. Learning to see moving objects in the dark. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 7323–7332, 2019.
[11] Huanjing Yue, Cong Cao, Lei Liao, Ronghe Chu, and Jingyu Yang. Supervised raw video denoising with a benchmark dataset on dynamic scenes. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2298–2307, 2020.
[12] Gerald Schaefer. An uncompressed benchmark image dataset for colour imaging. In 2010 IEEE International Conference on Image Processing, pages 3537–3540, 2010.
[13] Tobias Plötz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2750–2759, 2017.
[14] A. Abdelhamed, S. Lin, and M.-S. Brown. A high-quality denoising dataset for smartphone cameras. In CVPR proc., pages 1692–1700, 2018.
[15] Chongyi Li, Chunle Guo, Linghao Han, Jun Jiang, Ming-Ming Cheng, Jinwei Gu, and Chen Change Loy. Low-light image and video enhancement using deep learning: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(12):9396–9416, 2022.
[16] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015.
[17] Kun Zhou, Wenbo Li, Liying Lu, Xiaoguang Han, and Jiangbo Lu. Revisiting temporal alignment for video restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.
[18] Wei Wang, Xin Chen, Cheng Yang, Xiang Li, Xuemei Hu, and Tao Yue. Enhancing low light videos by exploring high sensitivity camera noise. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 4110–4118, 2019.
[19] J. Dai, H. Qi, Y. Xiong, Y. Li, G. Zhang, H. Hu, and Y. Wei. Deformable convolutional networks. In ICCV, pages 764–773, Oct 2017.
[20] Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yan Feng Wang, and Qi Tian. Low-light video enhancement with synthetic event guidance. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2):1692–1700, Jun. 2023.
[21] Danai Triantafyllidou, Sean Moran, Steven McDonagh, Sarah Parisot, and Gregory Slabaugh. Low light video enhancement using synthetic data produced with an intermediate domain mapping. In European Conference on Computer Vision, pages 103–119. Springer, 2020.
[22] N. Anantrasirichai, Alin Achim, and David Bull. Atmospheric turbulence mitigation for sequences with moving objects using recursive image fusion. In 2018 25th IEEE International Conference on Image Processing (ICIP), pages 2895–2899, 2018.
[23] Dinggang Shen. Image registration by local histogram matching. Pattern Recognition, 40(4):1161–1172, 2007.
[24] J.N. Sarvaiya, Suprava Patnaik, and Salman Bombaywala. Image registration by template matching using normalized cross-correlation. In 2009 International Conference on Advances in Computing, Control, and Telecommunication Technologies, pages 819–822, 2009.
[25] Abdelrahman Abdelhamed, Marcus Brubaker, and Michael Brown. Noise flow: Noise modeling with conditional normalizing flows. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 3165–3173, 2019.
[26] Ruirui Lin, Nantheera Anantrasirichai, Alexandra Malyugina, and David Bull. A spatio-temporal aligned sunet model for low-light video enhancement. In Submitting to IEEE International Conference on Image Processing, 2024.
[27] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. ICLR, 2021.
[28] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
[29] Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, Yaowei Wang, Qixiang Ye, and Yunfan Liu. Vmamba: Visual state space model. arXiv preprint arXiv:2401.10166, 2024.
[30] Hai Jiang, Ao Luo, Haoqiang Fan, Songchen Han, and Shuaicheng Liu. Low-light image enhancement with wavelet-based diffusion models. ACM Transactions on Graphics (TOG), 42(6):1–14, 2023.
[31] Xintao Wang, Kelvin C.K. Chan, Ke Yu, Chao Dong, and Chen Change Loy. EDVR: Video restoration with enhanced deformable convolutional networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2019.
[32] L. Sendur and I.W. Selesnick. Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency. IEEE Transactions on Signal Processing, 50(11):2744–2756, 2002.
[33] Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, and Xinggang Wang. Vision mamba: Efficient visual representation learning with bidirectional state space model. arXiv preprint arXiv:2401.09417, 2024.
[34] Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. Swinir: Image restoration using swin transformer. In 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 1833–1844, 2021.
[35] Alex O. Holcombe. Seeing slow and seeing fast: two limits on perception. Trends in Cognitive Sciences, pages 216–221, 2009.
[36] Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8798–8807, 2018.
[37] Saeed Anwar and Nick Barnes. Real image denoising with feature attention. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages 3155–3164, 2019.
