Bridging Restoration and Generation Manifolds in One-Step Diffusion for Real-World Super-Resolution
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-11 00:49 UTC · model grok-4.3
The pith
A one-step diffusion framework for real-world super-resolution anchors low-quality images on the correct trajectory to bridge restoration and generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IDaS-SR bridges the deterministic restoration manifold and the stochastic generation manifold in one inference step. The Manifold Inversion Noise Estimator predicts a severity-aware timestep and inversion noise that anchors arbitrary real-world low-quality latents onto the diffusion trajectory. CHARIOT enables navigation of the perception-distortion boundary by rescheduling trajectories and interpolating noise while preserving structural priors. Experiments show the method outperforms prior state-of-the-art approaches and transitions smoothly from structural restoration to texture hallucination.
What carries the argument
Manifold Inversion Noise Estimator (MINE), which predicts a severity-aware timestep and inversion noise to anchor low-quality latents onto the diffusion trajectory, together with CHARIOT, a continuous steering mechanism that reschedules trajectories and interpolates noise to control the perception-distortion balance.
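The anchoring step MINE performs can be sketched under standard DDPM conventions: a latent placed at timestep t on the trajectory takes the form z_t = sqrt(ᾱ_t)·z + sqrt(1−ᾱ_t)·ϵ. The sketch below is a minimal illustration, not the paper's implementation — the schedule, latent shape, `t_hat`, and `eps_inv` are all stand-ins for what MINE would predict from the degraded input.

```python
import numpy as np

def anchor_latent(z_lq, eps_inv, t_hat, alphas_cumprod):
    # Standard DDPM forward form z_t = sqrt(a_bar)*z + sqrt(1-a_bar)*eps,
    # evaluated at the severity-aware timestep t_hat predicted by MINE.
    a_bar = alphas_cumprod[t_hat]
    return np.sqrt(a_bar) * z_lq + np.sqrt(1.0 - a_bar) * eps_inv

rng = np.random.default_rng(0)
alphas_cumprod = np.linspace(0.9999, 0.0001, 1000)  # toy noise schedule, not the paper's
z_lq = rng.standard_normal((4, 8, 8))               # stand-in low-quality latent
eps_inv = rng.standard_normal((4, 8, 8))            # stand-in for MINE's inversion noise
t_hat = 347                                         # stand-in for MINE's predicted timestep
z_t = anchor_latent(z_lq, eps_inv, t_hat, alphas_cumprod)
```

A heavier degradation would push `t_hat` higher (more noise, more generative freedom); a mild one keeps it low, staying close to the restoration manifold.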
If this is right
- Real-world super-resolution runs in one inference step instead of repeated sampling iterations.
- The perception-distortion boundary becomes explicitly navigable through noise interpolation without losing structural priors.
- The method outperforms existing single-step and multi-step approaches on standard real-world benchmarks.
- The same model switches seamlessly between rigorous structural restoration and texture hallucination.
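The steering formulas quoted from the paper (t_mix = t̂ + s·Δt and ϵ_mix = Norm((1−s)·ϵ_inv + s·ϵ)) suggest how CHARIOT's navigation could work: a single scalar s slides from fidelity (s = 0, inversion noise at the predicted timestep) toward perceptual generation (s = 1, fresh noise at a later timestep). The sketch below is an assumption-laden reading, not the paper's code; in particular the excerpt does not define Norm(·), so unit-standard-deviation rescaling is used here as a placeholder.

```python
import numpy as np

def chariot_mix(eps_inv, eps_rand, t_hat, dt, s):
    # t_mix = t_hat + s*dt : reschedule the trajectory entry point.
    t_mix = t_hat + s * dt
    # eps_mix = Norm((1-s)*eps_inv + s*eps) : interpolate then renormalize
    # so the mixed noise keeps the unit scale the diffusion prior expects.
    mixed = (1.0 - s) * eps_inv + s * eps_rand
    mixed = mixed / (mixed.std() + 1e-8)  # Norm(.) assumed to be unit-std rescaling
    return t_mix, mixed

rng = np.random.default_rng(1)
eps_inv = rng.standard_normal((4, 8, 8))   # stand-in inversion noise
eps_rand = rng.standard_normal((4, 8, 8))  # stand-in fresh Gaussian noise
t_mix, eps_mix = chariot_mix(eps_inv, eps_rand, t_hat=347, dt=100, s=0.5)
```

Sweeping s over [0, 1] would trace the perception-distortion boundary the review describes, one forward pass per setting.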
Where Pith is reading between the lines
- The inversion estimation step could transfer to other diffusion-based tasks such as denoising or inpainting to reduce step counts.
- Single-pass operation may support real-time enhancement on devices with limited compute by cutting total sampling time.
- Further tuning of the steering interpolation might allow direct user control over detail level in enhanced outputs.
Load-bearing premise
The Manifold Inversion Noise Estimator can reliably predict a severity-aware timestep and inversion noise that correctly anchors arbitrary real-world low-quality latents onto the diffusion trajectory without introducing new artifacts or mismatches.
What would settle it
Apply the full IDaS-SR pipeline to a set of real low-resolution images with measured degradation levels and compare outputs against multi-step diffusion baselines; if the single-step results show systematic structural mismatches or added artifacts when using the predicted timestep, the anchoring claim fails.
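The proposed check reduces to a metric comparison between one-step outputs and multi-step baselines on the same degraded inputs. A minimal sketch of one such fidelity metric (PSNR), with random arrays standing in for real ground truth and model outputs:

```python
import numpy as np

def psnr(ref, out, peak=1.0):
    # Peak signal-to-noise ratio in dB; higher means closer to the reference.
    mse = np.mean((ref - out) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Stand-in arrays: in the real test these would be a ground-truth image,
# the one-step IDaS-SR output, and a multi-step diffusion baseline output.
rng = np.random.default_rng(2)
gt = rng.random((32, 32))
one_step = np.clip(gt + 0.01 * rng.standard_normal(gt.shape), 0.0, 1.0)
multi_step = np.clip(gt + 0.02 * rng.standard_normal(gt.shape), 0.0, 1.0)
gap = psnr(gt, one_step) - psnr(gt, multi_step)
```

A consistently large negative `gap` (or systematic structural artifacts visible in error maps) across degradation levels would undercut the anchoring claim; in practice the comparison would also use perceptual metrics such as LPIPS and FID.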
Original abstract
Pretrained diffusion models have revolutionized real-world image super-resolution (Real-ISR) but suffer from computational bottlenecks due to iterative sampling. Recent single-step distillation accelerates inference but faces a stark perception-distortion trade-off due to rigid timestep initialization, distributional trajectory mismatches, and fragile stochastic modulation. To address this, we present Adaptive Inversion and Degradation-aware Sampling for Real-ISR (IDaS-SR), a one-step framework bridging the deterministic restoration and stochastic generation manifolds. At its core, the Manifold Inversion Noise Estimator (MINE) resolves these initialization and trajectory mismatches by predicting a severity-aware timestep and inversion noise, precisely anchoring low-quality latents onto the diffusion trajectory. Furthermore, to mitigate fragile stochastic modulation, we propose CHARIOT, a continuous generative steering mechanism. By rescheduling trajectories and interpolating noise, it enables explicit navigation of the perception-distortion boundary without compromising structural priors. Extensive experiments demonstrate that IDaS-SR outperforms state-of-the-art methods, seamlessly transitioning from a rigorous structural restorer to a sophisticated texture hallucinator in a single inference step.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces IDaS-SR, a one-step diffusion framework for real-world image super-resolution that bridges deterministic restoration and stochastic generation manifolds. It proposes the Manifold Inversion Noise Estimator (MINE) to predict a severity-aware timestep and inversion noise that anchors arbitrary low-quality latents onto the pretrained diffusion trajectory, and CHARIOT, a continuous steering mechanism that reschedules trajectories and interpolates noise to navigate the perception-distortion boundary. The central claim is that this enables seamless transition from structural restoration to texture hallucination in a single inference step, with extensive experiments showing outperformance over state-of-the-art methods.
Significance. If the empirical claims hold, the work would be significant for practical real-world super-resolution by reducing the inference cost of diffusion models from many steps to one while addressing the rigid initialization and trajectory mismatch issues that plague distilled one-step methods. The MINE and CHARIOT mechanisms provide a concrete way to handle variable real-world degradations without post-hoc tuning, which could influence downstream applications in image restoration where both fidelity and perceptual quality matter.
major comments (2)
- [Abstract] Abstract and core method description: The claim that MINE 'precisely anchor[s] low-quality latents onto the diffusion trajectory' and resolves initialization/trajectory mismatches is load-bearing for the one-step bridging result. However, no quantitative validation is provided (e.g., timestep prediction error vs. degradation severity, correlation metrics, or ablation on unseen degradation combinations), leaving open whether MINE generalizes without introducing new artifacts or mismatches for arbitrary real-world LQ inputs.
- The outperformance and 'seamless transition' claims rest on experiments, yet the manuscript description supplies no tables, quantitative metrics (PSNR/SSIM/LPIPS/FID), ablation studies on MINE/CHARIOT components, or error analysis. Without these, it is impossible to verify whether the results support the central claims or reflect post-hoc choices.
minor comments (1)
- [Abstract] The abstract would be strengthened by including at least one key quantitative result (e.g., average improvement over a baseline) to ground the outperformance statement.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We appreciate the emphasis on empirical validation and have addressed each major comment point by point below. Revisions have been made to strengthen the presentation of supporting evidence.
Point-by-point responses
- Referee: [Abstract] Abstract and core method description: The claim that MINE 'precisely anchor[s] low-quality latents onto the diffusion trajectory' and resolves initialization/trajectory mismatches is load-bearing for the one-step bridging result. However, no quantitative validation is provided (e.g., timestep prediction error vs. degradation severity, correlation metrics, or ablation on unseen degradation combinations), leaving open whether MINE generalizes without introducing new artifacts or mismatches for arbitrary real-world LQ inputs.
  Authors: We agree that explicit quantitative validation of MINE's timestep prediction and anchoring behavior is necessary to support the central claims. In the revised manuscript we have added a dedicated analysis subsection that reports timestep prediction error (MAE) as a function of degradation severity, Pearson correlation between predicted and ground-truth severity, and ablation results on held-out degradation combinations (e.g., novel mixtures of blur, noise, and compression). Visual and quantitative error maps are included to demonstrate that no systematic new artifacts are introduced. These additions directly substantiate the generalization of MINE.
  Revision: yes
- Referee: [—] The outperformance and 'seamless transition' claims rest on experiments, yet the manuscript description supplies no tables, quantitative metrics (PSNR/SSIM/LPIPS/FID), ablation studies on MINE/CHARIOT components, or error analysis. Without these, it is impossible to verify whether the results support the central claims or reflect post-hoc choices.
  Authors: We acknowledge that the initial manuscript text did not sufficiently foreground the experimental tables and ablations. The complete paper already contains the requested quantitative results: tables reporting PSNR, SSIM, LPIPS and FID on multiple real-world benchmarks, component-wise ablations isolating MINE and CHARIOT, and error analysis including perceptual trade-off curves. In the revision we have inserted explicit forward references to these tables and figures within the method and results sections so that the supporting evidence is immediately visible to readers.
  Revision: partial
Circularity Check
No circularity: new mechanisms proposed without self-referential derivations or fitted predictions
Full rationale
The paper introduces IDaS-SR with core components MINE (Manifold Inversion Noise Estimator) and CHARIOT as novel mechanisms to address initialization mismatches and stochastic modulation in one-step diffusion for Real-ISR. No equations, derivations, or parameter-fitting steps are described in the provided abstract or summary that reduce predictions back to inputs by construction. Claims rest on the design of these new estimators and steering methods rather than renaming known results, self-citations as load-bearing premises, or ansatzes smuggled via prior work. The derivation chain is self-contained as a proposal of architectural innovations, with performance claims left to empirical validation outside any definitional loop.
Axiom & Free-Parameter Ledger
invented entities (2)
- Manifold Inversion Noise Estimator (MINE): no independent evidence
- CHARIOT: no independent evidence
Lean theorems connected to this paper
- Files: IndisputableMonolith/Foundation/RealityFromDistinction.lean; IndisputableMonolith/Cost/FunctionalEquation.lean
  Theorems: reality_from_one_distinction; washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Paper passage: "MINE resolves initialization and trajectory mismatches by predicting a severity-aware timestep and inversion noise, precisely anchoring low-quality latents onto the diffusion trajectory... CHARIOT... interpolating noise... navigate the perception-distortion boundary"
- Files: IndisputableMonolith/Foundation/ArithmeticFromLogic.lean; IndisputableMonolith/Foundation/ArrowOfTime.lean
  Theorems: LogicNat; arrow_from_z
  Tag: unclear (relation between the paper passage and the cited Recognition theorem)
  Paper passage: "adaptive timestep t̂ ... t_mix = t̂ + s·Δt ... ϵ_mix = Norm((1−s)·ϵ_inv + s·ϵ)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Agustsson, E., Timofte, R.: NTIRE 2017 challenge on single image super-resolution: Dataset and study. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 126–135 (2017)
- [2] Cai, J., Zeng, H., Yong, H., Cao, Z., Zhang, L.: Toward real-world single image super-resolution: A new benchmark and a new model. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 3086–3095 (2019)
- [3] Chen, J., Pan, J., Dong, J.: FaithDiff: Unleashing diffusion priors for faithful image super-resolution. In: Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 28188–28197 (2025)
- [4] Demir, U., Unal, G.: Patch-based image inpainting with generative adversarial networks. arXiv preprint arXiv:1803.07422 (2018)
- [5] Do, H.P., Chen, Y.W., Liao, Y.C., Hsiao, C.W., Wang, H.Y., Chiu, W.C., Huang, C.C.: DynFaceRestore: Balancing fidelity and quality in diffusion-guided blind face restoration with dynamic blur-level mapping and guidance. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10432–10441 (2025)
- [6] Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: European conference on computer vision, pp. 184–199. Springer (2014)
- [7] Dong, L., Fan, Q., Guo, Y., Wang, Z., Zhang, Q., Chen, J., Luo, Y., Zou, C.: TSD-SR: One-step diffusion with target score distillation for real-world image super-resolution. In: Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 23174–23184 (2025)
- [8] Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Advances in neural information processing systems 33, 6840–6851 (2020)
- [9] Hu, E.J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., Wang, L., Chen, W.: LoRA: Low-rank adaptation of large language models. In: International Conference on Learning Representations (2022)
- [10] Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 4401–4410 (2019)
- [11] Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., et al.: Photo-realistic single image super-resolution using a generative adversarial network. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4681–4690 (2017)
- [12] Li, J., Cao, J., Guo, Y., Li, W., Zhang, Y.: One diffusion step to real-world super-resolution via flow trajectory distillation. In: International Conference on Machine Learning (2025)
- [13] Li, Y., Zhang, K., Liang, J., Cao, J., Liu, C., Gong, R., Zhang, Y., Tang, H., Liu, Y., Demandolx, D., et al.: LSDIR: A large scale dataset for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1775–1787 (2023)
- [14] Liang, J., Zeng, H., Zhang, L.: Efficient and degradation-adaptive network for real-world image super-resolution. In: European Conference on Computer Vision, pp. 574–591. Springer (2022)
- [15] Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., Timofte, R.: SwinIR: Image restoration using Swin transformer. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 1833–1844 (2021)
- [16] Lim, B., Son, S., Kim, H., Nah, S., Mu Lee, K.: Enhanced deep residual networks for single image super-resolution. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 136–144 (2017)
- [17] Lin, X., He, J., Chen, Z., Lyu, Z., Dai, B., Yu, F., Qiao, Y., Ouyang, W., Dong, C.: DiffBIR: Toward blind image restoration with generative diffusion prior. In: European conference on computer vision, pp. 430–448. Springer (2024)
- [18] Lin, X., Yu, F., Hu, J., You, Z., Shi, W., Ren, J.S., Gu, J., Dong, C.: Harnessing diffusion-yielded score priors for image restoration. ACM Transactions on Graphics 44(6), 1–21 (2025)
- [19] Peebles, W., Xie, S.: Scalable diffusion models with transformers. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 4195–4205 (2023)
- [20] Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., Rombach, R.: SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv preprint arXiv:2307.01952 (2023)
- [21] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 10684–10695 (2022)
- [22] Sun, L., Wu, R., Ma, Z., Liu, S., Yi, Q., Zhang, L.: Pixel-level and semantic-level adjustable super-resolution: A dual-LoRA approach. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2333–2343 (2025)
- [23] Timofte, R., Agustsson, E., Van Gool, L., Yang, M.H., Zhang, L.: NTIRE 2017 challenge on single image super-resolution: Methods and results. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 114–125 (2017)
- [24] Wang, J., Fan, Q., Zhang, Q., Liu, H., Yu, Y., Chen, J., Ren, W.: Hero-SR: One-step diffusion for super-resolution with human perception priors. arXiv preprint arXiv:2412.07152 (2024)
- [25] Wang, J., Yue, Z., Zhou, S., Chan, K.C., Loy, C.C.: Exploiting diffusion prior for real-world image super-resolution. International Journal of Computer Vision 132(12), 5929–5949 (2024)
- [26] Wang, X., Xie, L., Dong, C., Shan, Y.: Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 1905–1914 (2021)
- [27] Wang, Y., Zhao, S., Zhang, K., Li, J., Zhang, L.: GenDR: Lighten generative detail restoration. In: The Fourteenth International Conference on Learning Representations (2025)
- [28] Wang, Y., Yang, W., Chen, X., Wang, Y., Guo, L., Chau, L.P., Liu, Z., Qiao, Y., Kot, A.C., Wen, B.: SinSR: Diffusion-based image super-resolution in a single step. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 25796–25805 (2024)
- [29] Wei, P., Xie, Z., Lu, H., Zhan, Z., Ye, Q., Zuo, W., Lin, L.: Component divide-and-conquer for real-world image super-resolution. In: European conference on computer vision, pp. 101–117. Springer (2020)
- [30] Wu, R., Sun, L., Ma, Z., Zhang, L.: One-step effective diffusion network for real-world image super-resolution. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024)
- [31] Wu, R., Yang, T., Sun, L., Zhang, Z., Li, S., Zhang, L.: SeeSR: Towards semantics-aware real-world image super-resolution. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 25456–25467 (2024)
- [32] Wu, Z., Sun, Z., Zhou, T., Fu, B., Cong, J., Dong, Y., Zhang, H., Tang, X., Chen, M., Wei, X.: OMGSR: You only need one mid-timestep guidance for real-world image super-resolution. arXiv preprint arXiv:2508.08227 (2025)
- [33] Wu, Z., Zheng, S., Jiang, P.T., Yuan, X.: Realism control one-step diffusion for real-world image super-resolution. arXiv preprint arXiv:2509.10122 (2025)
- [34] Yang, T., Wu, R., Ren, P., Xie, X., Zhang, L.: Pixel-aware stable diffusion for realistic image super-resolution and personalized stylization. In: European conference on computer vision, pp. 74–91. Springer (2024)
- [35] Yu, F., Gu, J., Li, Z., Hu, J., Kong, X., Wang, X., He, J., Qiao, Y., Dong, C.: Scaling up to excellence: Practicing model scaling for photo-realistic image restoration in the wild. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 25669–25680 (2024)
- [36] Yue, Z., Liao, K., Loy, C.C.: Arbitrary-steps image super-resolution via diffusion inversion. In: Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 23153–23163 (2025)
- [37] Yue, Z., Wang, J., Loy, C.C.: ResShift: Efficient diffusion model for image super-resolution by residual shifting. Advances in neural information processing systems 36, 13294–13307 (2023)
- [38] Zhang, K., Liang, J., Van Gool, L., Timofte, R.: Designing a practical degradation model for deep blind image super-resolution. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 4791–4800 (2021)
- [39] Zhang, L., You, W., Shi, K., Gu, S.: Uncertainty-guided perturbation for image super-resolution diffusion model. In: Computer Vision and Pattern Recognition, pp. 17980–17989 (2025)