Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration
Pith reviewed 2026-05-21 04:37 UTC · model grok-4.3
The pith
Disentangling stochastic interpolants into independent generation and regression lets one model control the fidelity-realism trade-off in image restoration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The stochastic interpolant process can be decomposed into independent generation and regression trajectories that share a single network and sampler, allowing any mixture ratio to be selected at inference time for controllable image restoration.
What carries the argument
DiSI's disentanglement of the stochastic interpolant process into independent generation and regression components, implemented with a dual-branch U-Net-style transformer and a unified sampler that handles arbitrary trajectories.
If this is right
- A single trained model can produce outputs anywhere along the continuum from high pixel fidelity to high perceptual realism.
- Few-step sampling remains efficient for any chosen point on the regression-to-generation spectrum.
- The same architecture works across multiple image restoration tasks without task-specific retraining.
- Conditional guidance is strengthened by a dedicated network branch while overall throughput stays high.
Where Pith is reading between the lines
- The disentanglement approach could extend to other stochastic modeling domains such as video or 3D restoration where similar fidelity-creativity tensions exist.
- Deployment pipelines might replace several specialized models with one flexible network that users tune per use case.
- Further work could test whether the independence holds when the input degradations differ substantially from the training distribution.
Load-bearing premise
The stochastic interpolant process admits a clean decomposition into independent generation and regression components that maintain their strengths when recombined without introducing artifacts or efficiency loss.
What would settle it
Train the DiSI model and check whether, at the pure-regression end of its control range, it matches the pixel accuracy of a dedicated regression baseline and, at the pure-generation end, matches the perceptual quality of a dedicated generative baseline on the same restoration task; failure at either extreme would indicate the decomposition does not fully preserve the separate advantages.
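The check described above amounts to a two-sided acceptance rule, which can be sketched as follows. This is only an illustration: the function name `endpoints_preserved`, the default tolerances, and the use of PSNR (higher is better) for fidelity and LPIPS (lower is better) for perceptual quality are assumptions made here, not details taken from the paper.

```python
def endpoints_preserved(
    disi_psnr_at_regression: float,
    baseline_regression_psnr: float,
    disi_lpips_at_generation: float,
    baseline_generative_lpips: float,
    psnr_tol_db: float = 0.1,   # hypothetical tolerance, in dB
    lpips_tol: float = 0.01,    # hypothetical tolerance, in LPIPS units
) -> dict:
    """Report whether each end of the control range matches its dedicated baseline."""
    # PSNR is higher-is-better: the pure-regression setting may trail the
    # regression baseline by at most psnr_tol_db.
    regression_ok = disi_psnr_at_regression >= baseline_regression_psnr - psnr_tol_db
    # LPIPS is lower-is-better: the pure-generation setting may exceed the
    # generative baseline by at most lpips_tol.
    generation_ok = disi_lpips_at_generation <= baseline_generative_lpips + lpips_tol
    return {
        "regression_ok": regression_ok,
        "generation_ok": generation_ok,
        # Failure at either extreme counts against the decomposition.
        "decomposition_holds": regression_ok and generation_ok,
    }
```

The rule deliberately reports the two endpoints separately, since a failure at only one extreme points to which component loses its advantage after disentanglement.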
Original abstract
Recent advances in Image Restoration (IR) have been largely driven by generative methods such as Diffusion Models and Flow Matching, which excel in synthesizing realistic textures while suffering from slow multi-step inference and compromised pixel fidelity. In contrast, classical regression-based IR methods excel precisely in these aspects, offering single-step efficiency and high pixel-level reconstruction fidelity. To bridge this gap, we propose DiSI, a unified framework that Disentangles the underlying Stochastic Interpolant process into independent generation and regression components. This decoupling endows DiSI with remarkable versatility, enabling a continuous and controllable transition from a pure regression process to a fully generative one. Technically, we instantiate this framework with two specific sampling trajectories, accompanied by a unified sampler for high-quality, few-step inference on arbitrary trajectories. Furthermore, we design a dual-branch U-Net style transformer network in pixel space, using a dedicated branch to enhance conditional guidance while ensuring high throughput. Extensive experiments demonstrate that DiSI efficiently achieves competitive results on various IR tasks, while uniquely offering the inference-time flexibility to control the distortion-perception trade-off within a single model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes DiSI, a unified framework that disentangles the underlying stochastic interpolant process into independent generation and regression components for image restoration. This decoupling is claimed to enable a continuous, controllable transition from pure regression to fully generative processes via two specific sampling trajectories, a unified sampler for few-step inference, and a dual-branch U-Net-style transformer network in pixel space that enhances conditional guidance. Experiments are reported to show competitive results on IR tasks while allowing inference-time control of the distortion-perception trade-off within a single model.
Significance. If the decomposition is shown to be exact and free of residual coupling, the work would meaningfully bridge generative methods (strong on textures but slow) and regression methods (efficient and pixel-accurate), providing practical inference-time flexibility that is currently unavailable in a single model. The unified sampler and dual-branch architecture are presented as efficiency-preserving innovations.
major comments (2)
- [Framework description (abstract and §3)] The central claim requires that the stochastic interpolant decomposes into truly independent generation and regression trajectories whose linear combination yields artifact-free control at arbitrary mixing ratios. The abstract and framework description state two endpoint trajectories plus a unified sampler, but provide no derivation or SDE/ODE analysis demonstrating elimination of cross terms for intermediate ratios; if residual coupling remains, the continuous-control claim reduces to interpolation between the two endpoints rather than a true continuum.
- [Network architecture (§4)] The dual-branch U-Net transformer is introduced to enhance conditional guidance while maintaining throughput, yet no ablation or analysis quantifies whether the dedicated branch preserves the claimed efficiency or introduces new artifacts at intermediate mixing ratios; this is load-bearing for the versatility claim.
minor comments (2)
- [Introduction and Experiments] The abstract refers to 'extensive experiments' demonstrating competitive results; the introduction or results section should explicitly tabulate comparisons against both pure regression baselines and recent generative IR methods (e.g., diffusion/flow-matching variants) with standard metrics and inference-step counts.
- [Notation] Notation for the mixing parameter and the two trajectories should be defined once in a dedicated subsection rather than introduced piecemeal across the abstract and technical sections.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major comment below with clarifications based on the framework and commit to revisions that strengthen the presentation without altering the core contributions.
Point-by-point responses
- Referee: [Framework description (abstract and §3)] The central claim requires that the stochastic interpolant decomposes into truly independent generation and regression trajectories whose linear combination yields artifact-free control at arbitrary mixing ratios. The abstract and framework description state two endpoint trajectories plus a unified sampler, but provide no derivation or SDE/ODE analysis demonstrating elimination of cross terms for intermediate ratios; if residual coupling remains, the continuous-control claim reduces to interpolation between the two endpoints rather than a true continuum.
Authors: We appreciate the referee highlighting the need for explicit analysis of independence. In §3 the disentanglement follows directly from the stochastic interpolant definition: the process is an affine combination of the clean image and noise, with the regression trajectory given by the deterministic conditional expectation and the generation trajectory incorporating the full stochastic forcing term. The unified sampler constructs intermediate trajectories by linear interpolation of the corresponding velocity fields. Because the underlying interpolant is linear, substitution into the SDE yields an interpolated process whose Fokker-Planck equation contains no residual cross-coupling terms between the regression and generation components. We will add a concise derivation together with the relevant SDE/ODE verification for arbitrary mixing ratios in the revised §3 to make this property fully explicit.
Revision: yes
- Referee: [Network architecture (§4)] The dual-branch U-Net transformer is introduced to enhance conditional guidance while maintaining throughput, yet no ablation or analysis quantifies whether the dedicated branch preserves the claimed efficiency or introduces new artifacts at intermediate mixing ratios; this is load-bearing for the versatility claim.
Authors: We agree that targeted ablations at intermediate mixing ratios are important for substantiating the efficiency and artifact-free versatility claims. The dual-branch design isolates conditional guidance in a separate path to improve modulation while keeping the overall parameter count and forward-pass cost comparable to a single-branch baseline. The current experiments report aggregate throughput and quality, but do not isolate the branch contribution across mixing ratios. In the revision we will add ablation tables and figures that measure wall-clock time, FLOPs, PSNR, and LPIPS for a range of mixing ratios using both the dual-branch model and an ablated single-branch counterpart, confirming that efficiency is preserved and no additional artifacts appear at intermediate points.
Revision: yes
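The ablation promised in the second response boils down to a sweep over mixing ratios that records cost and quality at each point. A minimal sketch of such a harness is given below, assuming a hypothetical callable `restore(degraded, ratio)` that stands in for either model variant. It covers only wall-clock time and PSNR; FLOPs and LPIPS need external tooling and are left out.

```python
import time

import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, data_range]."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def sweep_mixing_ratios(restore, degraded, clean, ratios, repeats=3):
    """Time `restore` and score its output at each mixing ratio.

    `restore` is a hypothetical stand-in for a trained model; it is called as
    restore(degraded, ratio) and must return an array shaped like `clean`.
    """
    rows = []
    for ratio in ratios:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            output = restore(degraded, ratio)
            timings.append(time.perf_counter() - start)
        rows.append({
            "ratio": float(ratio),
            "seconds": min(timings),          # best of `repeats` runs
            "psnr_db": psnr(clean, output),   # fidelity of the last run
        })
    return rows
```

Running the same sweep on the dual-branch model and on a single-branch counterpart, then comparing the two tables row by row, would show whether cost or quality diverges at intermediate ratios.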
Circularity Check
No circularity: DiSI proposes its independent decomposition as a novel framework, without self-referential reduction.
The paper introduces DiSI as a new framework that disentangles the stochastic interpolant process into independent generation and regression components, supported by a unified sampler and dual-branch network. No equations or derivations in the provided abstract reduce any claimed prediction or result to fitted inputs or prior self-citations by construction. The central claim of continuous controllable transition rests on the proposed decomposition and experimental validation rather than tautological redefinition or load-bearing self-citation chains. This is a standard case of a self-contained proposal where the derivation does not collapse to its own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: stochastic interpolant processes admit a meaningful decomposition into independent generation and regression components.
invented entities (1)
- DiSI framework with dual-branch U-Net transformer and unified sampler (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
DiSI process: $x(r,g) = \lambda_g(\alpha_r x_0 + \beta_r x_1) + \gamma_g z$ with GVP schedules $\alpha_r$, $\beta_r$, $\lambda_g = \cos g$, $\gamma_g = \sin g$, and $\varphi = \arcsin\sqrt{(1-\rho)/2}$
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · LogicNat recovery
Relation between the paper passage and the cited Recognition theorem: unclear
Two independent time variables $r$ (regression) and $g$ (generation) with PF-ODE $dx = v_r\,dr + v_g\,dg$
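The two quoted passages can be made concrete with a toy numerical sketch. The excerpt does not give the GVP schedules in closed form, so the code assumes alpha_r = cos(pi r / 2) and beta_r = sin(pi r / 2) for r in [0, 1], with g in radians on [0, pi/2]. The velocities are the sample-wise partial derivatives of x(r, g) for known x0, x1, z, used here as stand-ins for the learned fields v_r and v_g. All function names are illustrative.

```python
import numpy as np


def schedules(r):
    """Assumed GVP schedules: alpha_r = cos(pi r / 2), beta_r = sin(pi r / 2)."""
    return np.cos(0.5 * np.pi * r), np.sin(0.5 * np.pi * r)


def interpolant(x0, x1, z, r, g):
    """x(r, g) = cos(g) * (alpha_r * x0 + beta_r * x1) + sin(g) * z."""
    alpha, beta = schedules(r)
    return np.cos(g) * (alpha * x0 + beta * x1) + np.sin(g) * z


def velocity_r(x0, x1, z, r, g):
    """Sample-wise partial derivative of x(r, g) with respect to r."""
    d_alpha = -0.5 * np.pi * np.sin(0.5 * np.pi * r)
    d_beta = 0.5 * np.pi * np.cos(0.5 * np.pi * r)
    return np.cos(g) * (d_alpha * x0 + d_beta * x1)


def velocity_g(x0, x1, z, r, g):
    """Sample-wise partial derivative of x(r, g) with respect to g."""
    alpha, beta = schedules(r)
    return -np.sin(g) * (alpha * x0 + beta * x1) + np.cos(g) * z


def integrate_path(x0, x1, z, waypoints, steps_per_leg=2000):
    """Midpoint-rule integration of dx = v_r dr + v_g dg along straight legs."""
    r, g = waypoints[0]
    x = interpolant(x0, x1, z, r, g)
    for r_next, g_next in waypoints[1:]:
        dr = (r_next - r) / steps_per_leg
        dg = (g_next - g) / steps_per_leg
        for k in range(steps_per_leg):
            # Evaluate both velocities at the midpoint of the current sub-step.
            r_mid = r + (k + 0.5) * dr
            g_mid = g + (k + 0.5) * dg
            x = x + velocity_r(x0, x1, z, r_mid, g_mid) * dr
            x = x + velocity_g(x0, x1, z, r_mid, g_mid) * dg
        r, g = r_next, g_next
    return x
```

In this toy setting, integrating along `[(0, 0), (1, 0), (1, 1)]` and along `[(0, 0), (0, 1), (1, 1)]` reaches the same point up to discretization error, because x(r, g) is a smooth function of both variables. That says nothing about the learned velocity fields, which depend on the current state and are conditional expectations rather than sample-wise derivatives.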
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.