pith. machine review for the scientific record.

arxiv: 2605.00605 · v1 · submitted 2026-05-01 · 💻 cs.CV

Recognition: unknown

Faithful Extreme Image Rescaling with Learnable Reversible Transformation and Semantic Priors

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 18:52 UTC · model grok-4.3

classification 💻 cs.CV
keywords extreme image rescaling · diffusion models · reversible transformation · semantic priors · image super-resolution · latent space · perceptual quality · detail prior

The pith

FaithEIR uses a learnable reversible transformation and adaptive detail priors in a diffusion framework to achieve faithful extreme image rescaling at factors of 16× and higher.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that extreme rescaling suffers from information loss that breaks semantic structures and realistic details, and proposes a diffusion-based solution to recover both more consistently. It introduces a learnable reversible transformation inspired by singular value decomposition to support invertible downscaling and upscaling directly in latent space. An adaptive detail prior supplies missing high-frequency information by drawing on average structures observed in training data, while a lightweight semantic embedder conditions the diffusion process on pixel-level semantics from the input. Together these elements target higher reconstruction fidelity and perceptual quality than prior methods. If effective, the approach would allow reliable enlargement of low-resolution images in domains where detail preservation matters, such as surveillance or archival work.

Core claim

FaithEIR is a diffusion-based framework for extreme image rescaling. It develops a learnable reversible transformation that enables invertible downscaling and upscaling in latent space; proposes an adaptive detail prior, a high-frequency dictionary capturing empirical average structures from the training data, to offset quantization loss; and employs a lightweight pixel semantic embedder that supplies semantic conditioning to a pretrained diffusion model. Together these components are claimed to yield superior reconstruction fidelity and perceptual quality.

What carries the argument

The learnable reversible transformation, which performs invertible operations in latent space inspired by singular value decomposition, paired with the adaptive detail prior that acts as a data-derived high-frequency dictionary to compensate for lost information.
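
To make the SVD framing concrete, the sketch below shows one way a learnable invertible linear map can be built in latent space: orthogonal factors come from matrix exponentials of skew-symmetric generators, and singular values are kept strictly positive, so a closed-form inverse always exists. This is a minimal illustration assuming PyTorch, not the paper's actual implementation; all names are placeholders.

```python
import torch
import torch.nn as nn

class LearnableReversibleTransform(nn.Module):
    """Invertible map y = M z with M = U diag(s) V^T (SVD-style factorization)."""

    def __init__(self, dim: int):
        super().__init__()
        self.a = nn.Parameter(torch.zeros(dim, dim))  # generator for U
        self.b = nn.Parameter(torch.zeros(dim, dim))  # generator for V
        self.log_s = nn.Parameter(torch.zeros(dim))   # log singular values

    def _factors(self):
        u = torch.matrix_exp(self.a - self.a.T)  # exactly orthogonal U
        v = torch.matrix_exp(self.b - self.b.T)  # exactly orthogonal V
        s = self.log_s.exp()                     # strictly positive singular values
        return u, v, s

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        u, v, s = self._factors()
        m = (u * s) @ v.T          # M = U diag(s) V^T  (u * s scales U's columns)
        return z @ m.T             # apply M to each latent row vector

    def inverse(self, y: torch.Tensor) -> torch.Tensor:
        u, v, s = self._factors()
        m_inv = (v / s) @ u.T      # M^{-1} = V diag(1/s) U^T, defined for all parameters
        return y @ m_inv.T

# Round trip is exact up to floating-point error:
t = LearnableReversibleTransform(8)
z = torch.randn(4, 8)
assert torch.allclose(t.inverse(t(z)), z, atol=1e-5)
```

Because every parameter setting yields an invertible map, training can reshape the latent geometry for downscaling without ever losing the exact inverse needed for upscaling.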

If this is right

  • Semantic structures remain more consistent under scaling factors of 16× or higher compared with existing rescaling techniques.
  • Perceptual quality improves because the detail prior supplies realistic high-frequency content conditioned on the input.
  • The reversible latent-space mapping reduces the severity of the ill-posed mapping problem that normally arises in extreme rescaling.
  • A pretrained diffusion model can be reused for rescaling once equipped with the semantic embedder and detail prior.
  • Reconstruction metrics and visual inspection both show gains over state-of-the-art methods across multiple test sets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same reversible transformation might reduce error accumulation when chaining multiple image-processing steps that each involve downsampling.
  • If the adaptive prior proves robust, similar dictionary-style compensation could be added to other diffusion-based inverse problems such as inpainting or denoising.
  • The semantic embedder's lightweight design suggests it could be swapped for task-specific conditioning signals without retraining the entire diffusion backbone.
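
A minimal sketch of what such a swappable conditioning module could look like, assuming PyTorch; the architecture, names, and injection point are illustrative guesses, since the paper's embedder is only described as lightweight and pixel-level.

```python
import torch
import torch.nn as nn

class PixelSemanticEmbedder(nn.Module):
    """Small CNN mapping a low-resolution RGB image to a per-pixel
    conditioning map; cheap enough to retrain per task while the
    diffusion backbone stays frozen."""

    def __init__(self, cond_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.SiLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.SiLU(),
            nn.Conv2d(64, cond_dim, kernel_size=1),
        )

    def forward(self, lr_image: torch.Tensor) -> torch.Tensor:
        return self.net(lr_image)   # (B, cond_dim, H, W)

# Hypothetical usage: feed the map through whatever conditioning interface
# the backbone exposes (e.g., a zero-initialized projection added to the
# latent input), so swapping embedders never touches pretrained weights.
cond = PixelSemanticEmbedder()(torch.randn(1, 3, 32, 32))
```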

Load-bearing premise

The adaptive detail prior built from training data averages will generalize to new test images without introducing domain artifacts or breaking semantic consistency.
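
As a concrete, heavily simplified reading of "a high-frequency dictionary capturing empirical average structures", the toy sketch below clusters training-patch residuals and stores cluster means as atoms; at test time the nearest atom is added back to a coarse patch. Everything here (the crude low-pass, the k-means construction, the nearest-atom matching) is an assumption for illustration, not the paper's construction.

```python
import numpy as np

def high_freq(patches: np.ndarray) -> np.ndarray:
    """High-frequency residual: patch minus a crude low-pass (its own mean)."""
    return patches - patches.mean(axis=(1, 2), keepdims=True)

def build_dictionary(train_patches: np.ndarray, k: int = 64, iters: int = 20) -> np.ndarray:
    """K-means over training residuals; atoms are the empirical averages of
    commonly occurring high-frequency structures. Assumes >= k float patches."""
    rng = np.random.default_rng(0)
    hf = high_freq(train_patches).reshape(len(train_patches), -1)
    atoms = hf[rng.choice(len(hf), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((hf[:, None, :] - atoms[None, :, :]) ** 2).sum(-1)  # (N, k)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = hf[assign == j]
            if len(members):
                atoms[j] = members.mean(axis=0)   # the "empirical average" atom
    return atoms

def apply_prior(coarse_patch: np.ndarray, atoms: np.ndarray) -> np.ndarray:
    """Compensate a coarse reconstruction with its best-matching atom."""
    query = high_freq(coarse_patch[None]).reshape(-1)
    j = ((atoms - query) ** 2).sum(axis=1).argmin()
    return coarse_patch + atoms[j].reshape(coarse_patch.shape)
```

The load-bearing premise is exactly that such averages, estimated on the training distribution, remain sensible additions on unseen domains.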

What would settle it

Applying the method to images drawn from a domain distant from the training distribution and observing either semantic distortions or high-frequency artifacts that exceed those produced by simpler baselines.
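
One way to quantify the distribution shift this test probes, anticipating the prior-test mismatch measure the simulated rebuttal proposes below, is the mean cosine distance from each test residual to its nearest dictionary atom. The metric and its use here are editorial suggestions, not quantities reported in the paper.

```python
import numpy as np

def prior_test_mismatch(atoms: np.ndarray, test_residuals: np.ndarray) -> float:
    """Mean cosine distance from each flattened test high-frequency residual
    to its nearest dictionary atom; higher means worse prior coverage."""
    a = atoms / (np.linalg.norm(atoms, axis=1, keepdims=True) + 1e-12)
    r = test_residuals / (np.linalg.norm(test_residuals, axis=1, keepdims=True) + 1e-12)
    cos_sim = r @ a.T                      # (n_test, n_atoms) similarities
    return float((1.0 - cos_sim.max(axis=1)).mean())
```

A large gap between this score on in-distribution and out-of-distribution sets, alongside visible artifacts, would count against the generalization premise.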

Figures

Figures reproduced from arXiv: 2605.00605 by Ajmal Mian, Chenyang Ge, Hao Wei, Saeed Anwar, Yanhui Zhou.

Figure 1: Comparison with state-of-the-art rescaling methods.
Figure 2: Overview of FaithEIR. The VAE encoder maps the HR input into latent features. A latent rescaling module performs downscaling.
Figure 3: Macro-architecture comparison between previous methods.
Figure 4: Visual comparisons with state-of-the-art methods on the LSDIR-val dataset.
Figure 5: Visual comparisons with state-of-the-art methods on the CLIC2020 dataset.
Figure 6: Effectiveness of LRT and ADP.
Figure 7: Visualization of different ADP values. (a) w/o SP (b) w/ LTE (c) w/ SP (d) HR patch.
Figure 9: Failure case.
Original abstract

Most recent extreme rescaling methods struggle to preserve semantically consistent structures and produce realistic details, due to the severely ill-posed nature of low- to high-resolution mapping under scaling factors of $16\times$ or higher. To alleviate the above problems, we propose FaithEIR, a diffusion-based framework for extreme image rescaling. Inspired by singular value decomposition, we develop learnable reversible transformation that enables invertible downscaling and upscaling in the latent space. To compensate for information loss due to quantization, we propose an adaptive detail prior, a high-frequency dictionary that captures the empirical average of commonly occurring structures in the training data. Finally, we design a lightweight pixel semantic embedder to provide semantic conditioning for the pretrained diffusion model. We present extensive experimental results demonstrating that our FaithEIR consistently outperforms state-of-the-art methods, achieving superior reconstruction fidelity and perceptual quality. Our code, model weights, and detailed results are released at https://github.com/cshw2021/FaithEIR.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces FaithEIR, a diffusion-based framework for extreme image rescaling at factors of 16× and higher. It proposes a learnable reversible transformation (inspired by SVD) for invertible downscaling and upscaling in latent space, an adaptive detail prior consisting of a high-frequency dictionary built from empirical averages of training-data structures to offset quantization loss, and a lightweight pixel semantic embedder to supply semantic conditioning to a pretrained diffusion model. The central claim is that this combination yields superior reconstruction fidelity and perceptual quality over state-of-the-art methods, supported by extensive experiments and public release of code and weights.

Significance. If the empirical gains hold under rigorous validation, the work could advance extreme rescaling by mitigating information loss through reversible latent transforms and data-driven high-frequency compensation while preserving semantics. The public release of code, model weights, and detailed results is a clear strength that enables reproducibility and extension by the community.

major comments (2)
  1. [Abstract and §3, method] The adaptive detail prior is constructed solely as the empirical average of commonly occurring high-frequency structures in the training data. No mechanism is described that guarantees transfer to test images whose texture or semantic statistics differ from the training distribution; at 16×+ scales this risks injecting domain artifacts into the diffusion conditioning and undermining the claimed fidelity and consistency gains. An ablation on out-of-distribution test sets or a quantitative measure of prior-test mismatch is required to support the generalization assumption.
  2. [§4, experiments] The abstract asserts consistent outperformance in reconstruction fidelity and perceptual quality, yet the provided description does not detail the precise metrics (PSNR/SSIM/LPIPS/FID), the full set of baselines, or ablations that isolate the reversible transform from the detail prior. Without these, it is impossible to determine whether the reported superiority is robust or sensitive to post-hoc choices in training or evaluation.
minor comments (2)
  1. [§4] The scaling range tested (exact factors beyond 16×) and any degradation at higher factors should be explicitly tabulated or plotted in the experimental section for clarity.
  2. [§3] Notation for the learnable reversible transformation parameters and the detail-prior dictionary should be introduced with explicit equations in §3 to avoid ambiguity when describing invertibility.
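
For illustration, the explicit notation requested in minor comment 2 could take the following form; this is a hedged reconstruction from the abstract's SVD framing, not the paper's own equations.

```latex
% Assumed notation: z is the latent vector, theta = {U, V, s} are learned.
\begin{align}
  T_{\theta}(\mathbf{z}) &= \mathbf{U}\,\operatorname{diag}(\mathbf{s})\,\mathbf{V}^{\top}\mathbf{z},
  \qquad \mathbf{U}^{\top}\mathbf{U} = \mathbf{V}^{\top}\mathbf{V} = \mathbf{I},\; s_i > 0, \\
  T_{\theta}^{-1}(\mathbf{y}) &= \mathbf{V}\,\operatorname{diag}(\mathbf{s})^{-1}\,\mathbf{U}^{\top}\mathbf{y},
\end{align}
% so T^{-1}(T(z)) = z exactly, and invertibility holds for every parameter setting.
```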

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, clarifying our approach and proposing specific revisions to the manuscript where needed.

read point-by-point responses
  1. Referee: [Abstract and §3, method] The adaptive detail prior is constructed solely as the empirical average of commonly occurring high-frequency structures in the training data. No mechanism is described that guarantees transfer to test images whose texture or semantic statistics differ from the training distribution; at 16×+ scales this risks injecting domain artifacts into the diffusion conditioning and undermining the claimed fidelity and consistency gains. An ablation on out-of-distribution test sets or a quantitative measure of prior-test mismatch is required to support the generalization assumption.

    Authors: We thank the referee for this important observation on generalization. The adaptive detail prior is intentionally constructed from empirical averages of high-frequency structures that recur across diverse natural images, and the semantic conditioning via the pixel embedder is designed to modulate its application to the specific content of each test image. Nevertheless, we agree that explicit validation beyond the training distribution is valuable. In the revised manuscript we will add an ablation on out-of-distribution test sets (e.g., medical and artistic images) together with a quantitative measure of prior-test mismatch (average cosine distance between the learned dictionary and high-frequency residuals of the test images). Revision: yes.

  2. Referee: [§4, experiments] The abstract asserts consistent outperformance in reconstruction fidelity and perceptual quality, yet the provided description does not detail the precise metrics (PSNR/SSIM/LPIPS/FID), the full set of baselines, or ablations that isolate the reversible transform from the detail prior. Without these, it is impossible to determine whether the reported superiority is robust or sensitive to post-hoc choices in training or evaluation.

    Authors: We apologize for any lack of clarity in the experimental presentation. The full paper evaluates reconstruction fidelity with PSNR and SSIM and perceptual quality with LPIPS and FID, comparing against a comprehensive set of state-of-the-art baselines. We have also conducted component-wise ablations that isolate the learnable reversible transformation from the adaptive detail prior. In the revision we will expand §4 to explicitly enumerate all metrics, list every baseline, and insert a dedicated ablation table that reports the incremental contribution of each module. This will make the robustness of the results transparent. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity detected.

full rationale

The paper proposes a diffusion-based rescaling framework whose components (learnable reversible transform inspired by SVD, adaptive detail prior as empirical high-frequency dictionary from training data, semantic embedder) are assembled from external mathematical concepts and data-driven training rather than defined in terms of the target fidelity metric. Performance claims rest on experimental comparisons to SOTA methods on held-out test images, not on any fitted parameter or self-citation that is then renamed as a prediction. No load-bearing step reduces by construction to its own inputs; the derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 2 invented entities

The central claim rests on the effectiveness of three introduced components whose parameters are learned from data; no first-principles derivation is claimed.

free parameters (2)
  • learnable reversible transformation parameters
    Parameters of the SVD-inspired invertible mapping are optimized during training.
  • adaptive detail prior dictionary
    High-frequency dictionary entries are computed as empirical averages from the training set.
axioms (1)
  • domain assumption: A learnable transformation inspired by singular value decomposition can achieve invertible downscaling and upscaling in latent space.
    Invoked to motivate the reversible transformation component.
invented entities (2)
  • adaptive detail prior (no independent evidence)
    purpose: High-frequency dictionary to compensate for quantization-induced information loss.
    New component proposed to capture common structures from training data.
  • pixel semantic embedder (no independent evidence)
    purpose: Lightweight module to supply semantic conditioning to the pretrained diffusion model.
    New conditioning mechanism introduced for the rescaling task.

pith-pipeline@v0.9.0 · 5481 in / 1383 out tokens · 46473 ms · 2026-05-09T18:52:54.222813+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

37 extracted references · 4 canonical work pages · 2 internal anchors

  1. [Agustsson and Timofte, 2017] Eirikur Agustsson and Radu Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 126–135, 2017.
  2. [Akbari et al., 2019] Mohammad Akbari, Jie Liang, and Jingning Han. DSSLIC: Deep semantic segmentation-based layered image compression. In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2042–2046. IEEE, 2019.
  3. [Bao et al., 2025] Jingwei Bao, Jinhua Hao, Pengcheng Xu, Ming Sun, Chao Zhou, and Shuyuan Zhu. Plug-and-play tri-branch invertible block for image rescaling. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 1826–1834, 2025.
  4. [Bińkowski et al., 2018] Mikołaj Bińkowski, Danica J Sutherland, Michael Arbel, and Arthur Gretton. Demystifying MMD GANs. arXiv preprint arXiv:1801.01401, 2018.
  5. [Chen et al., 2025] Bin Chen, Gehui Li, Rongyuan Wu, Xindong Zhang, Jie Chen, Jian Zhang, and Lei Zhang. Adversarial diffusion compression for real-world image super-resolution. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 28208–28220, 2025.
  6. [Ding et al., 2020] Keyan Ding, Kede Ma, Shiqi Wang, and Eero P Simoncelli. Image quality assessment: Unifying structure and texture similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(5):2567–2581, 2020.
  7. [Esser et al., 2021] Patrick Esser, Robin Rombach, and Björn Ommer. Taming transformers for high-resolution image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12873–12883, 2021.
  8. [Guo et al., 2025] Jinpei Guo, Yifei Ji, Zheng Chen, Kai Liu, Min Liu, Wang Rao, Wenbo Li, Yong Guo, and Yulun Zhang. OSCAR: One-step diffusion codec across multiple bit-rates. arXiv preprint arXiv:2505.16091, 2025.
  9. [Heusel et al., 2017] Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30, 2017.
  10. [Ho et al., 2020] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
  11. [Karras et al., 2020] Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8110–8119, 2020.
  12. [Ke et al., 2021] Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, and Feng Yang. MUSIQ: Multi-scale image quality transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 5148–5157, 2021.
  13. [Kim et al., 2018] Heewon Kim, Myungsub Choi, Bee Lim, and Kyoung Mu Lee. Task-aware image downscaling. In Proceedings of the European Conference on Computer Vision (ECCV), pages 399–414, 2018.
  14. [Kingma and Dhariwal, 2018] Durk P Kingma and Prafulla Dhariwal. Glow: Generative flow with invertible 1x1 convolutions. Advances in Neural Information Processing Systems, 31, 2018.
  15. [Kingma, 2014] Diederik P Kingma. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  16. [Li et al., 2018] Yue Li, Dong Liu, Houqiang Li, Li Li, Zhu Li, and Feng Wu. Learning a convolutional neural network for image compact-resolution. IEEE Transactions on Image Processing, 28(3):1092–1107, 2018.
  17. [Li et al., 2023] Yawei Li, Kai Zhang, Jingyun Liang, Jiezhang Cao, Ce Liu, Rui Gong, Yulun Zhang, Hao Tang, Yun Liu, Denis Demandolx, et al. LSDIR: A large scale dataset for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1775–1787, 2023.
  18. [Li et al., 2025] Zhiyuan Li, Yanhui Zhou, Hao Wei, Chenyang Ge, and Ajmal Mian. RDEIC: Accelerating diffusion-based extreme image compression with relay residual diffusion. IEEE Transactions on Circuits and Systems for Video Technology, 2025.
  19. [Liang et al., 2021] Jingyun Liang, Andreas Lugmayr, Kai Zhang, Martin Danelljan, Luc Van Gool, and Radu Timofte. Hierarchical conditional flow: A unified framework for image super-resolution and image rescaling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 4076–4085, 2021.
  20. [Lim et al., 2017] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 136–144, 2017.
  21. [Lin et al., 2024] Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Yu Qiao, Wanli Ouyang, and Chao Dong. DiffBIR: Toward blind image restoration with generative diffusion prior. In European Conference on Computer Vision, pages 430–448. Springer, 2024.
  22. [Oquab et al., 2024] Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision. Transactions on Machine Learning Research, pages 1–31, 2024.
  23. [Sauer et al., 2024] Axel Sauer, Dominik Lorenz, Andreas Blattmann, and Robin Rombach. Adversarial diffusion distillation. In European Conference on Computer Vision, pages 87–103. Springer, 2024.
  24. [Shi et al., 2016] Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1874–1883, 2016.
  25. [Sun and Chen, 2020] Wanjie Sun and Zhenzhong Chen. Learned image downscaling for upscaling using content adaptive resampler. IEEE Transactions on Image Processing, 29:4027–4040, 2020.
  26. [Toderici et al., 2020] George Toderici, Lucas Theis, Nick Johnston, Eirikur Agustsson, Fabian Mentzer, Johannes Ballé, Wenzhe Shi, and Radu Timofte. CLIC 2020: Challenge on learned image compression. Retrieved March 29, 2021.
  27. [Wang et al., 2004] Zhou Wang, Alan C Bovik, Hamid R Sheikh, and Eero P Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
  28. [Wang et al., 2021] Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1905–1914, 2021.
  29. [Wang et al., 2025] Ce Wang, Zhenyu Hu, Wanjie Sun, and Zhenzhong Chen. Timestep-aware diffusion model for extreme image rescaling. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15594–15603, 2025.
  30. [Wei et al., 2024] Hao Wei, Chenyang Ge, Zhiyuan Li, Xin Qiao, and Pengchao Deng. Toward extreme image rescaling with generative prior and invertible prior. IEEE Transactions on Circuits and Systems for Video Technology, 34(7):6181–6193, 2024.
  31. [Wei et al., 2025] Hao Wei, Yanhui Zhou, Yiwen Jia, Chenyang Ge, Saeed Anwar, and Ajmal Mian. A lightweight model for perceptual image compression via implicit priors. Neural Networks, page 108279, 2025.
  32. [Wu et al., 2024] Rongyuan Wu, Tao Yang, Lingchen Sun, Zhengqiang Zhang, Shuai Li, and Lei Zhang. SeeSR: Towards semantics-aware real-world image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 25456–25467, 2024.
  33. [Xiao et al., 2023] Mingqing Xiao, Shuxin Zheng, Chang Liu, Zhouchen Lin, and Tie-Yan Liu. Invertible rescaling network and its extensions. International Journal of Computer Vision, 131(1):134–159, 2023.
  34. [Yang et al., 2023] Jinhai Yang, Mengxi Guo, Shijie Zhao, Junlin Li, and Li Zhang. Self-asymmetric invertible network for compression-aware image rescaling. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 3155–3163, 2023.
  35. [Zhang et al., 2024] Aiping Zhang, Zongsheng Yue, Renjing Pei, Wenqi Ren, and Xiaochun Cao. Degradation-guided one-step image super-resolution with diffusion priors. arXiv preprint arXiv:2409.17058, 2024.
  36. [Zhong et al., 2022] Zhixuan Zhong, Liangyu Chai, Yang Zhou, Bailin Deng, Jia Pan, and Shengfeng He. Faithful extreme rescaling via generative prior reciprocated invertible representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5708–5717, 2022.
  37. [Zhu et al., 2022] Yiming Zhu, Cairong Wang, Chenyu Dong, Ke Zhang, Hongyang Gao, and Chun Yuan. High-frequency normalizing flow for image rescaling. IEEE Transactions on Image Processing, 32:6223–6233, 2022.