pith. machine review for the scientific record.

arxiv: 2605.10275 · v1 · submitted 2026-05-11 · 💻 cs.CV

Recognition: no theorem link

PolarVSR: A Unified Framework and Benchmark for Continuous Space-Time Polarization Video Reconstruction

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 05:21 UTC · model grok-4.3

classification 💻 cs.CV
keywords polarization video reconstruction · DoFP imaging · implicit neural representation · space-time super-resolution · polarization benchmark · flow-guided loss · video enhancement

The pith

The first architecture to reconstruct polarization videos continuously across space and time from DoFP captures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes the first unified framework for reconstructing color polarization videos in continuous space and time. It does this by jointly modeling polarization directions with a polarization-aware implicit neural representation and introducing a flow-guided polarization variation loss to supervise dynamics. The work is supported by a new large-scale benchmark of DoFP polarization videos. Sympathetic readers would care because hardware limitations currently prevent high-frame-rate polarimetric video, and this method offers a way to enhance both spatial resolution and temporal sampling without new hardware.
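For orientation, the quantities the paper reconstructs follow from textbook linear polarimetry. The sketch below shows the standard relations connecting the four DoFP analyzer channels to the Stokes parameters, DoLP, and AoP; it is illustrative context, not the paper's code, and the paper's exact parameterization may differ.

```python
import numpy as np

def stokes_from_dofp(i0, i45, i90, i135, eps=1e-8):
    """Standard linear Stokes parameters from the four DoFP analyzer channels.

    Each input is an array of intensities measured behind a linear polarizer
    at 0/45/90/135 degrees (the four pixels of a DoFP super-pixel, after
    demosaicking). These are textbook relations, not the paper's code.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal/vertical preference
    s2 = i45 - i135                     # diagonal preference

    dolp = np.sqrt(s1**2 + s2**2) / (s0 + eps)  # Degree of Linear Polarization
    aop = 0.5 * np.arctan2(s2, s1)              # Angle of Polarization in [-pi/2, pi/2]
    return s0, s1, s2, dolp, aop
```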

Core claim

We propose the first space-time polarization video reconstruction architecture. The method jointly models polarization directions in space and time and uses a polarization-aware implicit neural representation for continuous, high-fidelity upsampling. By analyzing temporal variations in polarization parameters, we further introduce a flow-guided polarization variation loss to supervise polarization dynamics. We also establish the first large-scale color DoFP polarization video benchmark to support this research direction.

What carries the argument

polarization-aware implicit neural representation for jointly modeling polarization directions in space and time
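A minimal sketch of what such a decoder could look like, assuming a coordinate-conditioned MLP that predicts Stokes parameters at continuous space-time queries. The class name, layer sizes, and the choice of (s0, s1, s2) as the output are assumptions; the paper's exact design is not specified here.

```python
import torch
import torch.nn as nn

class PolarizationINR(nn.Module):
    """Hypothetical coordinate-based decoder in the spirit of a
    polarization-aware INR: it maps a continuous space-time query (x, y, t),
    plus a feature vector sampled from an encoder, to Stokes parameters."""

    def __init__(self, feat_dim=64, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # (s0, s1, s2) per query
        )

    def forward(self, coords, feats):
        # coords: (N, 3) continuous (x, y, t) in [-1, 1];
        # feats: (N, feat_dim) features sampled from the encoder at each query.
        return self.mlp(torch.cat([coords, feats], dim=-1))

# Querying a denser grid than the input realizes arbitrary-scale upsampling:
# any (x, y, t) is a valid query, so spatial scale and frame rate are
# chosen at test time rather than baked into the architecture.
```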

If this is right

  • Enables continuous upsampling of polarization parameters to arbitrary resolutions in space and time.
  • Allows supervision of polarization dynamics via a flow-guided variation loss (see the sketch after this list).
  • Establishes a benchmark for space-time polarization video reconstruction research.
  • Shows effectiveness through experiments on the new dataset.
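One plausible shape for the flow-guided polarization variation loss named above: backward-warp the polarization maps of frame t onto frame t+1 along optical flow, then penalize the mismatch between the predicted and ground-truth temporal variation. Both function names are hypothetical, and the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def warp(x, flow):
    """Backward-warp a (B, C, H, W) map by a (B, 2, H, W) optical flow field
    using grid_sample. Standard flow warping, not the paper's exact code."""
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=x.device, dtype=x.dtype),
        torch.arange(w, device=x.device, dtype=x.dtype),
        indexing="ij",
    )
    grid_x = 2.0 * (xs + flow[:, 0]) / (w - 1) - 1.0  # normalize to [-1, 1]
    grid_y = 2.0 * (ys + flow[:, 1]) / (h - 1) - 1.0
    grid = torch.stack([grid_x, grid_y], dim=-1)
    return F.grid_sample(x, grid, align_corners=True)

def flow_guided_variation_loss(pred_t, pred_t1, gt_t, gt_t1, flow):
    """Align frame t to frame t+1 along the flow, then ask the predicted
    temporal change of the polarization maps (e.g. stacked DoLP/AoP or
    Stokes channels) to match the ground-truth change."""
    pred_var = pred_t1 - warp(pred_t, flow)  # predicted variation along motion
    gt_var = gt_t1 - warp(gt_t, flow)        # reference variation along motion
    return F.l1_loss(pred_var, gt_var)
```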

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method might apply to reconstructing other multi-dimensional sensor data from mosaic arrays.
  • It could support real-world applications in material identification during motion if the benchmark generalizes.
  • Future tests on datasets with extreme dynamics could help identify where the implicit representation needs improvement.

Load-bearing premise

The polarization-aware implicit neural representation and flow-guided polarization variation loss will generalize to real-world dynamic scenes beyond the new benchmark.

What would settle it

A head-to-head evaluation of reconstructed DoLP and AoP against ground truth on an independent collection of high-speed polarization videos: large discrepancies would undercut the generalization claim, while close agreement would support it.
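If such a comparison were run, the per-pixel error measures would need to respect the π-periodicity of AoP. A minimal sketch, with hypothetical helper names:

```python
import numpy as np

def aop_error(aop_pred, aop_gt):
    """Angular error for AoP, which is periodic with period pi: -pi/2 and
    +pi/2 describe the same polarization orientation, so a comparison that
    ignores the wrap-around would overstate discrepancies."""
    diff = np.abs(aop_pred - aop_gt) % np.pi
    return np.minimum(diff, np.pi - diff)

def dolp_error(dolp_pred, dolp_gt):
    """DoLP lives in [0, 1], so plain absolute error is meaningful."""
    return np.abs(dolp_pred - dolp_gt)

# Averaging these per-pixel errors over an independent high-speed test set
# is one concrete way to run the comparison described above.
```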

Figures

Figures reproduced from arXiv: 2605.10275 by Boxin Shi, Chenggong Li, Degui Yang, Junchao Zhang, Yidong Luo.

Figure 1. The workflow of polarization reconstruction. Video sequences are captured by a DoFP camera, where each mosaic frame samples pixels from four …
Figure 2. The motivation of this work. (a) Shape from polarization under perspective projection. AoP …
Figure 3. Overview of the proposed framework. (a) The workflow. E and D denote the encoder and decoder …
Figure 4. Overview of the proposed benchmark. (a) Acquisition settings. Indoor …
Figure 5. Visualization of different denoising strategies. The first and third rows …
Figure 6. Visual comparisons of demosaicking (2×) and 8× interpolation results on synthetic data. The horizontal and vertical axes represent temporal frames and methods, respectively. DoLP and AoP are jointly visualized using the HSV color space shown in (h). (a)-(b) Results of ATD [67]+VFIT [68] and PIDSR [15]+SCUBA [16], respectively. Since these VFI algorithms only support 2× upsampling, most intermediate timeste…
Figure 7. Visual comparisons of 4× super-resolution and 4× interpolation results on real-world data. The horizontal and vertical axes represent methods and temporal frames, respectively. DoLP and AoP are jointly visualized. The last row displays the unpolarized light I at frame 3. (a)-(b) Results of PIDSR [15]+SCUBA [16] and ZoomingSlowMo [49], respectively. Since these interpolation patterns only support 2× upsampl…
Figure 8. Visualizations of the ablation study on synthetic data for demosaicking (…
Figure 9. Feature visualization. (a) Overlaid AoP at …
Original abstract

Polarimetric imaging captures surface polarization characteristics, such as the Degree of Linear Polarization (DoLP) and the Angle of Polarization (AoP). In mainstream Division-of-Focal-Plane (DoFP) color polarization imaging, recovering polarization parameters from captured mosaic arrays remains a challenging inverse problem. Existing DoFP cameras also face hardware bottlenecks and often cannot support high-frame-rate acquisition, limiting polarimetric imaging in dynamic video tasks. These limitations motivate joint spatial and temporal enhancement. To this end, we propose the first space-time polarization video reconstruction architecture. The method jointly models polarization directions in space and time and uses a polarization-aware implicit neural representation for continuous, high-fidelity upsampling. By analyzing temporal variations in polarization parameters, we further introduce a flow-guided polarization variation loss to supervise polarization dynamics. We also establish the first large-scale color DoFP polarization video benchmark to support this research direction. Extensive experiments on this benchmark demonstrate the effectiveness of the method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes PolarVSR, the first space-time polarization video reconstruction architecture for DoFP color polarization videos. It jointly models polarization directions in space and time via a polarization-aware implicit neural representation for continuous high-fidelity upsampling, introduces a flow-guided polarization variation loss to supervise temporal dynamics, establishes the first large-scale color DoFP polarization video benchmark, and validates the approach via extensive experiments on that benchmark.

Significance. If the results hold, the work addresses a clear gap in dynamic polarimetric imaging by enabling joint spatial-temporal enhancement beyond hardware limits of DoFP sensors. The new benchmark is a valuable, reusable contribution that can anchor future research. The polarization-aware INR and flow-guided loss are direct, technically motivated adaptations that integrate polarization-specific cues with continuous representations; these choices are strengths when paired with the dataset release.

major comments (1)
  1. [Experiments] The method and all reported comparisons are evaluated exclusively on the newly introduced benchmark. This setup creates a generalization risk for the claim that the architecture demonstrates effectiveness, as the benchmark may not capture the full range of real-world dynamic polarization scenes (e.g., varying lighting, materials, or motion types). Independent test sets or cross-dataset evaluation would be needed to support broader claims.
minor comments (2)
  1. [Abstract] The statement that 'extensive experiments demonstrate the effectiveness' would be strengthened by naming the primary metrics, the number of baselines, and the key quantitative gains.
  2. [Method] The precise encoding of DoLP/AoP and Stokes parameters inside the polarization-aware INR is not fully detailed; a short equation or diagram would improve reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive summary and for recognizing the benchmark's value. We address the single major comment below.

Point-by-point responses
  1. Referee: [Experiments] The method and all reported comparisons are evaluated exclusively on the newly introduced benchmark. This setup creates a generalization risk for the claim that the architecture demonstrates effectiveness, as the benchmark may not capture the full range of real-world dynamic polarization scenes (e.g., varying lighting, materials, or motion types). Independent test sets or cross-dataset evaluation would be needed to support broader claims.

    Authors: We agree that exclusive evaluation on the new benchmark carries a generalization risk, as no prior public color DoFP polarization video datasets exist for this task. The benchmark was deliberately constructed with diverse scenes spanning multiple lighting conditions, material types, and motion patterns to reduce this risk, but it cannot claim exhaustive coverage of all real-world variations. In the revised manuscript we will add an explicit limitations paragraph in the Experiments section (and a brief note in the Conclusion) acknowledging this point and stating that future cross-dataset validation will be essential once additional datasets are released. This revision clarifies the scope of our claims without altering the experimental results or core contributions.

    Revision: partial

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper introduces a polarization-aware implicit neural representation and a flow-guided polarization variation loss as novel components addressing the stated DoFP video limitations, along with a new benchmark for evaluation. Nothing in the provided abstract and description reduces any prediction or result to fitted inputs or to prior self-referential definitions by construction. The technical choices are presented as independent responses to hardware and inverse-problem challenges, making the derivation self-contained, with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Insufficient information from abstract alone; no specific free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.0 · 5477 in / 1029 out tokens · 46216 ms · 2026-05-12T05:21:30.206982+00:00 · methodology


Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages · 1 internal anchor

  1. Y. Luo, J. Zhang, and C. Li, "Cpifuse: Toward realistic color and enhanced textures in color polarization image fusion," Information Fusion, vol. 120, p. 103111, 2025.
  2. Z. Liu, B. Wang, L. Wang, C. Mao, and Y. Li, "Sharecmp: Polarization-aware rgb-p semantic segmentation," IEEE Transactions on Circuits and Systems for Video Technology, 2025.
  3. M. Yao, M. Wang, K.-M. Tam, L. Li, T. Xue, and J. Gu, "Polarfree: Polarization-based reflection-free imaging," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 10890–10899.
  4. Y. Lyu, Z. Cui, S. Li, M. Pollefeys, and B. Shi, "Physics-guided reflection separation from a pair of unpolarized and polarized images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 2, pp. 2151–2165, 2022.
  5. C. Lei, X. Huang, M. Zhang, Q. Yan, W. Sun, and Q. Chen, "Polarized reflection removal with perfect alignment in the wild," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 1750–1758.
  6. V. Deschaintre, Y. Lin, and A. Ghosh, "Deep polarization imaging for 3d shape and svbrdf acquisition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 15567–15576.
  7. Y. Lyu, L. Zhao, S. Li, and B. Shi, "Shape from polarization with distant lighting estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, pp. 13991–14004, 2023.
  8. C. Lei, C. Qi, J. Xie, N. Fan, V. Koltun, and Q. Chen, "Shape from polarization for complex scenes in the wild," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 12632–12641.
  9. D. Chen, H. Zheng, J. Fang, X. Dong, X. Li, W. Liao, T. He, P. Peng, and J. Shen, "Rethinking temporal fusion with a unified gradient descent view for 3d semantic occupancy prediction," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025, pp. 1505–1515.
  10. G. Ponimatkin, M. Cífka, T. Souček, M. Fourmy, Y. Labbé, V. Petrik, and J. Sivic, "6d object pose tracking in internet videos for robotic manipulation," arXiv preprint arXiv:2503.10307, 2025.
  11. W. Hong, Y. Cheng, Z. Yang, W. Wang, L. Wang, X. Gu, S. Huang, Y. Dong, and J. Tang, "Motionbench: Benchmarking and improving fine-grained video motion understanding for vision language models," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 8450–8460.
  12. Y. Liu, S. Dong, S. Wang, Y. Yin, Y. Yang, Q. Fan, and B. Chen, "Slam3r: Real-time dense scene reconstruction from monocular rgb videos," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 16651–16662.
  13. D. Rebhan, M. Rosenberger, and G. Notni, "Principle investigations on polarization image sensors," in Photonics and Education in Measurement Science 2019, vol. 11144. SPIE, 2019, pp. 50–54.
  14. E. Collett, Field Guide to Polarization. SPIE Press, Bellingham, 2005, vol. 15.
  15. S. Zhou, C. Zhou, Y. Lyu, H. Guo, Z. Ma, B. Shi, and I. Sato, "Pidsr: Complementary polarized image demosaicing and super-resolution," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 16081–16090.
  16. X. Zhang, X. Wang, Y. Xu, X. Wu, and F. Huang, "Polarization video frame interpolation for 3d human pose reconstruction with attention mechanism," Optics and Lasers in Engineering, vol. 193, p. 109046, 2025.
  17. Z. Chen, Y. Chen, J. Liu, X. Xu, V. Goel, Z. Wang, H. Shi, and X. Wang, "Videoinr: Learning video implicit neural representation for continuous space-time super-resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 2047–2057.
  18. Y.-H. Chen, S.-C. Chen, Y.-Y. Lin, and W.-H. Peng, "Motif: Learning motion trajectories with local implicit neural functions for continuous space-time video super-resolution," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 23131–23141.
  19. E. Kim, H. Kim, K. H. Jin, and J. Yoo, "Bf-stvsr: B-splines and fourier—best friends for high fidelity spatial-temporal video super-resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025, pp. 28009–28018.
  20. M. Jaderberg, K. Simonyan, A. Zisserman et al., "Spatial transformer networks," Advances in Neural Information Processing Systems, vol. 28, 2015.
  21. S. Niklaus and F. Liu, "Softmax splatting for video frame interpolation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 5437–5446.
  22. Y. Wang, L. Lipson, and J. Deng, "Sea-raft: Simple, efficient, accurate raft for optical flow," in Proceedings of the European Conference on Computer Vision, 2024, pp. 36–54.
  23. M. Morimatsu, Y. Monno, M. Tanaka, and M. Okutomi, "Monochrome and color polarization demosaicking using edge-aware residual interpolation," in IEEE International Conference on Image Processing. IEEE, 2020, pp. 2571–2575.
  24. S. Qiu, Q. Fu, C. Wang, and W. Heidrich, "Linear polarization demosaicking for monochrome and colour polarization focal plane arrays," in Computer Graphics Forum, vol. 40, no. 6. Wiley Online Library, 2021, pp. 77–89.
  25. S. Wen, Y. Zheng, F. Lu, and Q. Zhao, "Convolutional demosaicing network for joint chromatic and polarimetric imagery," Optics Letters, vol. 44, no. 22, pp. 5646–5649, 2019.
  26. C. Li, Y. Luo, C. Wu, J. Zhang, D. Yang, and D. Zhao, "Demosaicking customized diffusion model for snapshot polarization imaging," Optics & Laser Technology, vol. 188, p. 112868, 2025.
  27. M. D. A. B. A. Rahman, Y. Monno, M. Tanaka, and M. Okutomi, "Polarization denoising and demosaicking: Dataset and baseline method," in IEEE International Conference on Image Processing. IEEE, 2025, pp. 2724–2729.
  28. Z. Zhu, X. Li, J. Zhai, and H. Hu, "Podb: A learning-based polarimetric object detection benchmark for road scenes in adverse weather conditions," Information Fusion, vol. 108, p. 102385, 2024.
  29. Y. Jeon, E. Choi, Y. Kim, Y. Moon, K. Omer, F. Heide, and S.-H. Baek, "Spectral and polarization vision: Spectro-polarimetric real-world dataset," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 22098–22108.
  30. N. Li, Y. Zhao, Q. Pan, and S. G. Kong, "Demosaicking dofp images using newton's polynomial interpolation and polarization difference model," Optics Express, vol. 27, no. 2, pp. 1376–1391, 2019.
  31. R. Wu, Y. Zhao, N. Li, and S. G. Kong, "Polarization image demosaicking using polarization channel difference prior," Optics Express, vol. 29, no. 14, pp. 22066–22079, 2021.
  32. S. Wen, Y. Zheng, and F. Lu, "A sparse representation based joint demosaicing method for single-chip polarized color sensor," IEEE Transactions on Image Processing, vol. 30, pp. 4171–4182, 2021.
  33. Y. Luo, J. Zhang, and D. Tian, "Sparse representation-based demosaicking method for joint chromatic and polarimetric imagery," Optics and Lasers in Engineering, vol. 164, p. 107526, 2023.
  34. Y. Luo, J. Zhang, J. Shao, J. Tian, and J. Ma, "Learning a non-locally regularized convolutional sparse representation for joint chromatic and polarimetric demosaicking," IEEE Transactions on Image Processing, vol. 33, pp. 5029–5044, 2024.
  35. J. Zhang, J. Shao, H. Luo, X. Zhang, B. Hui, Z. Chang, and R. Liang, "Learning a convolutional demosaicing network for microgrid polarimeter imagery," Optics Letters, vol. 43, no. 18, pp. 4534–4537, 2018.
  36. Y. Sun, J. Zhang, and R. Liang, "Color polarization demosaicking by a convolutional neural network," Optics Letters, vol. 46, no. 17, pp. 4338–4341, 2021.
  37. V. Nguyen, M. Tanaka, Y. Monno, and M. Okutomi, "Two-step color-polarization demosaicking network," in IEEE International Conference on Image Processing, 2022, pp. 1011–1015.
  38. Y. Guo, X. Dai, S. Wang, G. Jin, and X. Zhang, "Attention-based progressive discrimination generative adversarial networks for polarimetric image demosaicing," IEEE Transactions on Computational Imaging, vol. 10, pp. 713–725, 2024.
  39. C. Li, Y. Luo, J. Zhang, and Y. Degui, "Polarization uncertainty-guided diffusion model for color polarization image demosaicking," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 8, 2026, pp. 6028–6036.
  40. H. Hu, S. Yang, X. Li, Z. Cheng, T. Liu, and J. Zhai, "Polarized image super-resolution via a deep convolutional neural network," Optics Express, vol. 31, no. 5, pp. 8535–8547, 2023.
  41. D. Yu, Q. Li, Z. Zhang, G. Huo, C. Xu, and Y. Zhou, "Color polarization image super-resolution reconstruction via a cross-branch supervised learning strategy," Optics and Lasers in Engineering, vol. 165, p. 107469, 2023.
  42. H. Mei, Z. Wang, X. Yang, X. Wei, and T. Delbruck, "Deep polarization reconstruction with pdavis events," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 22149–22158.
  43. I. Hwang, K. Choi, H. Ha, and M. H. Kim, "Benchmarking burst super-resolution for polarization images: Noise dataset and analysis," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025, pp. 24899–24909.
  44. V. Sitzmann, J. Martel, A. Bergman, D. Lindell, and G. Wetzstein, "Implicit neural representations with periodic activation functions," Advances in Neural Information Processing Systems, vol. 33, pp. 7462–7473, 2020.
  45. X. Wang, K. Yu, C. Dong, and C. C. Loy, "Recovering realistic texture in image super-resolution by deep spatial feature transform," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 606–615.
  46. K. C. Chan, S. Zhou, X. Xu, and C. C. Loy, "Basicvsr++: Improving video super-resolution with enhanced propagation and alignment," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5972–5981.
  47. X. Xu, L. Siyao, W. Sun, Q. Yin, and M.-H. Yang, "Quadratic video interpolation," Advances in Neural Information Processing Systems, vol. 32, 2019.
  48. M. Haris, G. Shakhnarovich, and N. Ukita, "Space-time-aware multi-resolution video enhancement," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2859–2868.
  49. X. Xiang, Y. Tian, Y. Zhang, Y. Fu, J. P. Allebach, and C. Xu, "Zooming slow-mo: Fast and accurate one-stage space-time video super-resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3370–3379.
  50. G. Xu, J. Xu, Z. Li, L. Wang, X. Sun, and M.-M. Cheng, "Temporal modulation network for controllable space-time video super-resolution," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 6388–6397.
  51. J. He, T. Xue, D. Liu, X. Lin, P. Gao, D. Lin, Y. Qiao, W. Ouyang, and Z. Liu, "Venhancer: Generative space-time enhancement for video generation," arXiv preprint arXiv:2407.07667, 2024.
  52. S. Wei, F. Li, C. Zhou, R. Cong, Y. Zhao, and H. Bai, "Osdenhancer: Taming real-world space-time video super-resolution with one-step diffusion," arXiv preprint arXiv:2601.20308, 2026.
  53. A. Becker, J. Erbach, D. Narnhofer, and K. Schindler, "Continuous space-time video super-resolution with 3d fourier fields," arXiv preprint arXiv:2509.26325, 2025.
  54. M. Tzabari and Y. Y. Schechner, "Polarized optical-flow gyroscope," in Proceedings of the European Conference on Computer Vision, 2020, pp. 363–381.
  55. Y. Chen, S. Liu, and X. Wang, "Learning continuous image representation with local implicit image function," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 8628–8638.
  56. Y. Chen, O. Wang, R. Zhang, E. Shechtman, X. Wang, and M. Gharbi, "Image neural field diffusion models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8007–8017.
  57. Y. Wang, J. Yu, and J. Zhang, "Zero-shot image restoration using denoising diffusion null-space model," arXiv preprint arXiv:2212.00490, 2022.
  58. H. Chihaoui, A. Lemkhenter, and P. Favaro, "Blind image restoration via fast diffusion inversion," Advances in Neural Information Processing Systems, vol. 37, pp. 34513–34532, 2024.
  59. H. Chung, J. Kim, M. T. Mccann, M. L. Klasky, and J. C. Ye, "Diffusion posterior sampling for general noisy inverse problems," arXiv preprint arXiv:2209.14687, 2022.
  60. C. Saharia, J. Ho, W. Chan, T. Salimans, D. J. Fleet, and M. Norouzi, "Image super-resolution via iterative refinement," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 4, pp. 4713–4726, 2022.
  61. B. Chen, Z. Zhang, W. Li, C. Zhao, J. Yu, S. Zhao, J. Chen, and J. Zhang, "Invertible diffusion models for compressed sensing," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 5, pp. 3992–4006, 2025.
  62. X. Zhu, H. Hu, S. Lin, and J. Dai, "Deformable convnets v2: More deformable, better results," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 9308–9316.
  63. W.-S. Lai, J.-B. Huang, O. Wang, E. Shechtman, E. Yumer, and M.-H. Yang, "Learning blind video temporal consistency," in Proceedings of the European Conference on Computer Vision, 2018, pp. 170–185.
  64. J. J. Koenderink, "The structure of images," Biological Cybernetics, vol. 50, no. 5, pp. 363–370, 1984.
  65. K. He, J. Sun, and X. Tang, "Guided image filtering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 6, pp. 1397–1409, 2012.
  66. J. Zhang, H. Luo, R. Liang, W. Zhou, B. Hui, and Z. Chang, "Pca-based denoising method for division of focal plane polarimeters," Optics Express, vol. 25, no. 3, pp. 2391–2400, 2017.
  67. L. Zhang, W. Long, Y. Li, X. Zhou, X. Zhao, and S. Gu, "Atd: Improved transformer with adaptive token dictionary for image restoration," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026.
  68. L. Lu, R. Wu, H. Lin, J. Lu, and J. Jia, "Video frame interpolation with transformer," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 3532–3542.
  69. H. Jiang, D. Sun, V. Jampani, M.-H. Yang, E. Learned-Miller, and J. Kautz, "Super slomo: High quality estimation of multiple intermediate frames for video interpolation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 9000–9008.
  70. Z. Teed and J. Deng, "Raft: Recurrent all-pairs field transforms for optical flow," in Proceedings of the European Conference on Computer Vision, 2020, pp. 402–419.