Edge-Efficient Image Restoration: Transformer Distillation into State-Space Models
Recognition: 3 Lean theorem links
Pith reviewed 2026-05-08 18:09 UTC · model grok-4.3
The pith
Distilling transformer features into state-space models and using multi-objective search yields hybrid networks up to 3.4 times faster on edge CPUs for image restoration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Lightweight state-space model blocks trained as feature-distilled surrogates of transformer blocks can be combined, through Efficient Network Search, into hybrid U-Net architectures that optimize restoration quality while penalizing transformer usage. On a Snapdragon 8 Elite CPU this delivers inference up to 3.4 times faster for deblurring, 1.74 times faster for deraining, and 1.17 times faster for denoising, while maintaining competitive quality.
What carries the argument
Efficient Network Search (ENS), a multi-objective strategy that selects hybrid configurations from pre-aligned transformer and SSM blocks by maximizing restoration quality and penalizing transformer blocks as a latency proxy.
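The selection rule can be sketched as a scalarized objective: quality maximized, transformer-block count penalized as the latency proxy. The PSNR scoring, the linear penalty form, the weight, and all candidate numbers below are illustrative assumptions, not the authors' implementation.

```python
def ens_score(psnr: float, n_transformer_blocks: int, lam: float = 0.2) -> float:
    """Score a candidate hybrid: reward restoration quality (validation PSNR),
    penalize transformer-block count as a lightweight latency proxy."""
    return psnr - lam * n_transformer_blocks

# Hypothetical candidates: (validation PSNR in dB, transformer-block count).
candidates = {
    "all_ssm": (31.2, 0),
    "hybrid_2t": (32.0, 2),
    "hybrid_4t": (32.3, 4),
    "all_transformer": (32.5, 8),
}
# The search keeps the configuration with the best penalized score.
best = max(candidates, key=lambda k: ens_score(*candidates[k]))
```

With these made-up numbers the penalty steers selection toward a mostly-SSM hybrid even though the all-transformer model has the highest raw PSNR, which is the trade-off ENS is designed to express.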
If this is right
- ENS-Deblurring runs in 2973 ms, 3.4 times faster than the transformer baseline.
- ENS-Deraining runs in 5816 ms, 1.74 times faster than the transformer baseline.
- ENS-Denoising runs in 8666 ms, 1.17 times faster than the transformer baseline.
- The discovered hybrids keep competitive PSNR and SSIM on standard restoration benchmarks.
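The headline multiples follow from the abstract's 10119.52 ms Restormer baseline; a quick arithmetic check using only numbers stated in the review:

```python
# Verify the reported speedups against the Restormer baseline on
# Snapdragon 8 Elite (all figures taken from the abstract).
baseline_ms = 10119.52
runtimes_ms = {"deblurring": 2973, "deraining": 5816, "denoising": 8666}
speedups = {task: baseline_ms / ms for task, ms in runtimes_ms.items()}
# Rounded to two decimals these reproduce the reported 3.40x, 1.74x, 1.17x.
```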
Where Pith is reading between the lines
- The same distillation-plus-search pattern could be tested on other vision tasks where attention latency limits mobile deployment.
- Replacing the transformer-usage penalty with alternative efficiency proxies might allow the search to target different hardware constraints without new measurements.
- If the hybrids prove stable across input resolutions, they could support real-time restoration pipelines in consumer camera applications.
Load-bearing premise
Feature distillation from transformers into SSM blocks preserves enough task-specific information for fine-grained restoration without substantial quality loss.
What would settle it
Measuring runtime and restoration metrics of an ENS-selected hybrid on Snapdragon 8 Elite hardware and finding either no speedup or a clear drop in quality relative to the Restormer baseline would falsify the central claim.
Figures
read the original abstract
We propose a modular framework for hybrid image restoration that integrates transformer and state-space model (SSM) blocks with a focus on improving runtime efficiency on edge hardware. While transformers provide strong global modeling through self-attention, their attention kernels incur substantial latency on mobile devices, especially for high-resolution inputs. In contrast, SSMs such as Mamba offer lineartime sequence modeling with lower runtime overhead but may underperform on fine grained restoration tasks. To balance accuracy and efficiency, we train lightweight SSM blocks as feature-distilled surrogates of transformer blocks and use them to construct hybrid U-Net-style architectures. To automatically discover effective block combinations, we introduce Efficient Network Search (ENS), a multi-objective search strategy that selects task-specific hybrid configurations from pre-aligned components. ENS optimizes restoration quality while penalizing transformer usage, serving as a lightweight proxy for latency and enabling architecture discovery without repeated hardware profiling. On a Snapdragon 8 Elite CPU, the Restormer baseline requires 10119.52 ms for inference. In contrast, ENS-discovered hybrids significantly reduce runtime: ENS-Deblurring runs in 2973 ms (3.4x faster), ENS-Deraining in 5816 ms (1.74x faster), and ENS-Denoising in 8666 ms (1.17x faster), while maintaining competitive restoration quality.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a modular hybrid framework for image restoration that combines transformer blocks (for global modeling) with state-space model (SSM) blocks (e.g., Mamba-style for linear-time efficiency). Lightweight SSM blocks are trained via feature distillation to act as surrogates for transformer blocks; these pre-aligned components are then assembled into U-Net-style hybrids. Efficient Network Search (ENS) is introduced as a multi-objective procedure that optimizes restoration quality while penalizing transformer-block count as a proxy for latency, enabling architecture search without repeated hardware profiling. On Snapdragon 8 Elite, the resulting ENS-Deblurring, ENS-Deraining, and ENS-Denoising models report speedups of 3.4×, 1.74×, and 1.17× over Restormer while claiming competitive quality.
Significance. If the empirical claims hold, the work offers a practical route to edge-deployable restoration models by exploiting distillation to transfer transformer capabilities into efficient SSMs and using search to automate hybrid design. The concrete Snapdragon runtime numbers are a strength, as is the avoidance of per-candidate hardware measurements during search. However, the proxy-based optimization and limited validation of quality preservation constrain the reliability and generalizability of the efficiency gains.
major comments (3)
- [ENS procedure] ENS procedure: the search objective penalizes transformer-block count as a latency proxy without any reported correlation analysis, ablation, or validation against actual Snapdragon timings (which also depend on sequence length, scan efficiency, and memory patterns). This assumption is load-bearing for the central claim that ENS 'automatically discover[s] effective block combinations' for latency-efficient hybrids; the final reported timings only validate the selected models, not the search procedure itself.
- [Experimental results] Experimental results: the headline speedups (2973 ms, 5816 ms, 8666 ms) and 'competitive restoration quality' are asserted, yet the manuscript provides no explicit quality metrics (PSNR/SSIM values, exact baselines, training protocols, or statistical significance tests). Without these, it is impossible to verify whether the quality claim holds or whether the speedups come at an unacceptable performance cost.
- [Distillation method] Distillation method: the assumption that feature distillation from transformers into lightweight SSM blocks preserves sufficient task-specific information for fine-grained restoration is central to the hybrid construction, but the manuscript does not detail the distillation loss, feature-alignment strategy, or any ablation showing minimal quality degradation. This directly affects the weakest assumption identified in the review.
minor comments (2)
- [Abstract] Abstract contains minor language issues: 'fine grained' should be hyphenated as 'fine-grained'; 'lineartime' should be 'linear-time'.
- [Figures and notation] Ensure all acronyms (ENS, SSM, etc.) are defined on first use and that figure captions clearly label the hardware platform and metric for each runtime bar.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, providing clarifications and committing to revisions where the manuscript can be strengthened.
read point-by-point responses
-
Referee: [ENS procedure] ENS procedure: the search objective penalizes transformer-block count as a latency proxy without any reported correlation analysis, ablation, or validation against actual Snapdragon timings (which also depend on sequence length, scan efficiency, and memory patterns). This assumption is load-bearing for the central claim that ENS 'automatically discover[s] effective block combinations' for latency-efficient hybrids; the final reported timings only validate the selected models, not the search procedure itself.
Authors: We agree that a direct validation of the proxy would strengthen the ENS procedure. The transformer-block penalty is motivated by the well-established higher computational and memory costs of self-attention compared to linear-time SSMs on edge hardware. To address the concern, we will add a correlation analysis between the proxy objective and measured Snapdragon latencies across a sampled set of hybrid architectures, along with an ablation on varying penalty weights and their effect on discovered models and runtimes. These additions will be included in the revised manuscript. revision: yes
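The promised proxy validation could be as simple as a rank correlation between transformer-block count and measured latency over a sample of hybrids. A minimal sketch with entirely made-up measurements (ties ignored):

```python
# Spearman rank correlation between the transformer-count proxy and
# hypothetical measured Snapdragon latencies for sampled hybrids.
# All data here are illustrative, not from the paper.

def spearman(xs, ys):
    """Spearman rho for sequences of distinct values (no tie handling)."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0.0] * len(vs)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

proxy = [0, 2, 4, 6, 8]                        # transformer blocks per hybrid
latency_ms = [2950, 4100, 5900, 8200, 10100]   # hypothetical measurements
rho = spearman(proxy, latency_ms)              # 1.0 iff the proxy ranks latency perfectly
```

A high rho across sampled architectures would support the proxy; a low one would confirm the referee's concern that sequence length, scan efficiency, and memory patterns dominate.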
-
Referee: [Experimental results] Experimental results: the headline speedups (2973 ms, 5816 ms, 8666 ms) and 'competitive restoration quality' are asserted, yet the manuscript provides no explicit quality metrics (PSNR/SSIM values, exact baselines, training protocols, or statistical significance tests). Without these, it is impossible to verify whether the quality claim holds or whether the speedups come at an unacceptable performance cost.
Authors: We acknowledge that the presentation of quantitative results could be more explicit in the main text. The full experimental section contains PSNR/SSIM tables on standard benchmarks (GoPro, Rain100H, BSD68) with direct comparisons to Restormer and other baselines, along with training protocols (Adam optimizer, specific epoch counts and learning rates). We will revise to prominently feature these metrics in the main body, include exact numerical values, and add any available statistical details from repeated runs. revision: yes
-
Referee: [Distillation method] Distillation method: the assumption that feature distillation from transformers into lightweight SSM blocks preserves sufficient task-specific information for fine-grained restoration is central to the hybrid construction, but the manuscript does not detail the distillation loss, feature-alignment strategy, or any ablation showing minimal quality degradation. This directly affects the weakest assumption identified in the review.
Authors: We agree that additional methodological details and validation are warranted. The revised manuscript will specify the distillation loss (a weighted combination of feature-level MSE and task-specific restoration loss) and the layer-wise feature alignment strategy. We will also add an ablation study comparing restoration quality of distilled SSM blocks against both non-distilled SSMs and the original transformer blocks to quantify any degradation. revision: yes
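The distillation objective described in this response can be sketched as a weighted sum of a feature-level MSE term (matching the paper's ||O_S − O_T||_2^2) and a task restoration loss. The L1 restoration term and unit weights are assumptions for illustration, not the authors' exact settings.

```python
def distill_loss(student_feat, teacher_feat, restored, target,
                 alpha=1.0, beta=1.0):
    """Weighted distillation objective over flat feature/pixel lists:
    alpha * feature MSE (SSM student vs. transformer teacher outputs)
    + beta * task restoration loss (L1 assumed here)."""
    n = len(student_feat)
    feat_mse = sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat)) / n
    task_l1 = sum(abs(r - g) for r, g in zip(restored, target)) / len(restored)
    return alpha * feat_mse + beta * task_l1
```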
Circularity Check
No significant circularity; claims rest on independent empirical measurements.
full rationale
The paper presents no mathematical derivation chain that reduces to its inputs by construction. Runtime claims (e.g., ENS-Deblurring at 2973 ms on Snapdragon 8 Elite) are supported by direct post-search hardware profiling of the selected hybrids, not by the transformer-count proxy used inside ENS. The proxy serves only as a search heuristic and does not redefine or force the reported latency numbers. No self-definitional equations, fitted-input predictions, load-bearing self-citations, or ansatz smuggling appear; the work is empirical and self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- ENS objective weights
- Distillation loss coefficients
axioms (2)
- domain assumption Feature distillation from transformer to SSM blocks preserves sufficient information for restoration tasks
- domain assumption ENS search without hardware profiling is a valid proxy for actual edge latency
Lean theorems connected to this paper
-
IndisputableMonolith.Patterns (8-tick period from 2^D=8 with D=3) · DimensionForcing / atomic_tick
unclear · Relation between the paper passage and the cited Recognition theorem.
We define an 8-dimensional search space based on the fixed Restormer configuration: [4, 6, 6, 8] in encoder and bottleneck, and [6, 6, 4, 4] in decoder and refinement.
-
IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel
unclear · Relation between the paper passage and the cited Recognition theorem.
BO models both objectives with Gaussian Processes and iteratively selects x by maximizing the Expected Hypervolume Improvement (EHVI).
-
IndisputableMonolith.Cost (J(x) = ½(x + x⁻¹) − 1) · Jcost_unit0 / Jcost_pos_of_ne_one
unclear · Relation between the paper passage and the cited Recognition theorem.
We minimize the loss L_distill = |O_S − O_T|_2^2 to make MambaIR blocks serve as lightweight surrogates.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Abdelhamed, A., Lin, S., Brown, M.S.: A high-quality denoising dataset for smartphone cameras. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1692–1700 (2018)
2018
-
[2]
In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X 16
Abuolaim, A., Brown, M.S.: Defocus deblurring using dual-pixel data. In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X 16. pp. 111–126. Springer (2020)
2020
-
[3]
In: Proceedings of the IEEE/CVF International Conference on Computer Vision
Abuolaim, A., Delbracio, M., Kelly, D., Brown, M.S., Milanfar, P.: Learning to reduce defocus blur by realistically modeling dual-pixel data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 2289–2298 (2021)
2021
-
[4]
Longformer: The Long-Document Transformer
Beltagy, I., Peters, M.E., Cohan, A.: Longformer: The long-document transformer. In: arXiv preprint arXiv:2004.05150 (2020)
2020
-
[5]
In: CVPR (2021)
Chen, H., Wang, Y., Guo, J., et al.: Pre-trained image processing transformer. In: CVPR (2021)
2021
-
[6]
arXiv preprint arXiv:2402.15648 (2024)
Chen, J., Zhang, Y., Xu, Y., et al.: Mambair: Efficient state space model for image restoration. arXiv preprint arXiv:2402.15648 (2024)
-
[7]
In: ECCV (2022)
Chen, L., He, J., Fan, Y., et al.: Simple baseline for image restoration with transformer. In: ECCV (2022)
2022
-
[8]
arXiv preprint arXiv:2404.11778 (2024)
Chen, R., Song, K., et al.: Dvmsr: Distilled mamba for lightweight super-resolution. arXiv preprint arXiv:2404.11778 (2024)
-
[9]
In: CVPR (2023)
Chen, Y., Xie, L., Lin, W., et al.: Hat: Image restoration using hierarchical aggregation transformer. In: CVPR (2023)
2023
-
[10]
Cheng, S., Wang, Y., Huang, H., Liu, D., Fan, H., Liu, S.: Nbnet: Noise basis learning for image denoising with subspace projection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 4896–4906 (2021)
2021
-
[11]
In: Proceedings of the IEEE/CVF international conference on computer vision
Cho, S.J., Ji, S.W., Hong, J.P., Jung, S.W., Ko, S.J.: Rethinking coarse-to-fine approach in single image deblurring. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 4641–4650 (2021)
2021
-
[12]
In: ICLR (2021)
Choromanski, K., Likhosherstov, V., Dohan, D., et al.: Rethinking attention with performers. In: ICLR (2021)
2021
-
[14]
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Dao, T., Gu, A., et al.: Jamba: Hybrid transformer-mamba with moe for scalable long-context language modeling. arXiv preprint arXiv:2404.14219 (2024)
2024
-
[15]
In: CVPR (2022)
Dong, X., Bao, J., Chen, D., et al.: Cswin transformer: A general vision transformer backbone with cross-shaped windows. In: CVPR (2022)
2022
-
[16]
ICLR (2021)
Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. ICLR (2021)
2021
-
[17]
IEEE Transactions on Image Processing 26(6), 2944–2956 (2017)
Fu, X., Huang, J., Ding, X., Liao, Y., Paisley, J.: Clearing the skies: A deep network architecture for single-image rain removal. IEEE Transactions on Image Processing 26(6), 2944–2956 (2017)
2017
-
[18]
arXiv preprint arXiv:2312.17143 (2023)
Gu, A., Dao, T., Chen, X., et al.: Combining recurrent state spaces and linear attention for long-context tasks. arXiv preprint arXiv:2312.17143 (2023)
-
[19]
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Gu, A., Dao, T., Fu, A.R., et al.: Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752 (2023)
2023
-
[20]
Guo, C.: Awesome mamba in low-level vision (2024), https://github.com/csguoh/Awesome-Mamba-in-Low-Level-Vision
2024
-
[21]
IEEE Transactions on Image Processing (2024)
Hu, L., Zhang, Y., et al.: Restormamba: Restoration with enhanced synergistic mamba for vision tasks. IEEE Transactions on Image Processing (2024), DOI: 10.1109/TIP.2024.3367824
-
[22]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Jiang, K., Wang, Z., Yi, P., Chen, C., Huang, B., Luo, Y., Ma, J., Jiang, J.: Multi-scale progressive fusion network for single image deraining. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8346–8355 (2020)
2020
-
[23]
arXiv preprint arXiv:2403.10123 (2024)
Jiang, Q., Xu, Y., et al.: Tinyvim: Frequency-aware hybrid vision model for compact restoration. arXiv preprint arXiv:2403.10123 (2024)
-
[24]
arXiv preprint arXiv:2501.13353 (2024)
Kang, Y., Zhang, Y., et al.: Serpent: Structured ssm for high-resolution image restoration. arXiv preprint arXiv:2501.13353 (2024)
-
[25]
IEEE Transactions on Image Processing 27(3), 1126–1137 (2017)
Karaali, A., Jung, C.R.: Edge-based defocus blur estimation with adaptive scale selection. IEEE Transactions on Image Processing 27(3), 1126–1137 (2017)
2017
-
[26]
In: Proceedings of the IEEE conference on computer vision and pattern recognition
Kupyn, O., Budzan, V., Mykhailych, M., Mishkin, D., Matas, J.: Deblurgan: Blind motion deblurring using conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 8183–8192 (2018)
2018
-
[27]
In: Proceedings of the IEEE/CVF international conference on computer vision
Kupyn, O., Martyniuk, T., Wu, J., Wang, Z.: Deblurgan-v2: Deblurring (orders-of-magnitude) faster and better. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 8878–8887 (2019)
2019
-
[28]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Lee, J., Lee, S., Cho, S., Lee, S.: Deep defocus map estimation using domain adaptation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 12222–12230 (2019)
2019
-
[29]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Lee, J., Son, H., Rim, J., Cho, S., Lee, S.: Iterative filter adaptive network for single image defocus deblurring. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 2034–2042 (2021)
2021
-
[30]
arXiv preprint arXiv:2301.00945 (2023)
Lee, Y., Han, D., Kim, J.: Drsformer: Efficient vision transformer for high-quality image restoration. arXiv preprint arXiv:2301.00945 (2023)
-
[31]
arXiv preprint arXiv:2412.20066 (2024)
Li, B., Zhao, H., Wang, W., Hu, P., Gou, Y., Peng, X.: Mair: A locality- and continuity-preserving mamba for image restoration. arXiv preprint arXiv:2412.20066 (2024)
-
[32]
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Li, S., Araujo, I.B., Ren, W., Wang, Z., Tokuda, E.K., Hirata Junior, R., Cesar-Junior, R., Zhang, J., Guo, X., Cao, X.: Single image deraining: A comprehensive benchmark analysis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3838–3847 (2019)
2019
-
[33]
In: Proceedings of the European conference on computer vision (ECCV)
Li, X., Wu, J., Lin, Z., Liu, H., Zha, H.: Recurrent squeeze-and-excitation context aggregation net for single image deraining. In: Proceedings of the European conference on computer vision (ECCV). pp. 254–269 (2018)
2018
-
[34]
Vmambair: Visual state space model for image restoration. arXiv preprint arXiv:2403.11423, 2024
Li, Y., Shen, Z., Zhang, Y., et al.: Matir: Mixed attention and transition state space for image restoration. arXiv preprint arXiv:2403.11423 (2024)
-
[35]
In: ICCVW (2021)
Liang, J., Cao, J., Sun, K., et al.: Swinir: Image restoration using swin transformer. In: ICCVW (2021)
2021
-
[36]
arXiv preprint arXiv:2403.06578 (2024)
Lin, J., Fang, B., et al.: Hymba: Hybrid memory-augmented mamba with meta-tokens. arXiv preprint arXiv:2403.06578 (2024)
-
[37]
Neurocomputing p. 129488 (2025)
Liu, C., Zhang, D., Lu, G., Yin, W., Wang, J., Luo, G.: Srmamba-t: Exploring the hybrid mamba-transformer network for single image super-resolution. Neurocomputing p. 129488 (2025)
2025
-
[38]
arXiv preprint arXiv:2401.16583 (2024)
Liu, X., He, H., Hu, X., et al.: Cu-mamba: Channel and spatial-aware ssm for image restoration. arXiv preprint arXiv:2401.16583 (2024)
-
[39]
In: ICCV (2021)
Liu, Z., Lin, Y., Cao, Y., et al.: Swin transformer: Hierarchical vision transformer using shifted windows. In: ICCV (2021)
2021
-
[40]
In: CVPR (2023)
Luo, L., Chen, Y., He, X., et al.: Rformer: A recurrent vision transformer for image restoration. In: CVPR (2023)
2023
-
[41]
In: NeurIPS (2016)
Luo, W., Li, Y., Urtasun, R., et al.: Understanding the effective receptive field in deep convolutional neural networks. In: NeurIPS (2016)
2016
-
[42]
arXiv preprint arXiv:2301.11699 (2023)
Luo, Z., Gustafsson, F.K., Zhao, Z., Sjölund, J., Schön, T.B.: Image restoration with mean-reverting stochastic differential equations. arXiv preprint arXiv:2301.11699 (2023)
-
[43]
In: Proceedings of the IEEE conference on computer vision and pattern recognition
Nah, S., Hyun Kim, T., Mu Lee, K.: Deep multi-scale convolutional neural network for dynamic scene deblurring. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3883–3891 (2017)
2017
-
[44]
In: European conference on computer vision
Park, D., Kang, D.U., Kim, J., Chun, S.Y.: Multi-temporal recurrent neural networks for progressive non-uniform single image deblurring with incremental temporal training. In: European conference on computer vision. pp. 327–343. Springer (2020)
2020
-
[45]
In: Proceedings of the IEEE/CVF international conference on computer vision
Purohit, K., Suin, M., Rajagopalan, A., Boddeti, V.N.: Spatially-adaptive image restoration using distortion-guided networks. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 2309–2319 (2021)
2021
-
[46]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Ren, C., He, X., Wang, C., Zhao, Z.: Adaptive consistency prior based deep network for image denoising. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8596–8606 (2021)
2021
-
[47]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Ren, D., Zuo, W., Hu, Q., Zhu, P., Meng, D.: Progressive image deraining networks: A better and simpler baseline. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 3937–3946 (2019)
2019
-
[48]
In: Computer vision–ECCV 2020: 16th European conference, glasgow, UK, August 23–28, 2020, proceedings, part XXV 16
Rim, J., Lee, H., Won, J., Cho, S.: Real-world blur dataset for learning and benchmarking deblurring algorithms. In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16. pp. 184–
2020
-
[49]
In: Proceedings of the IEEE/CVF international conference on computer vision
Shen, Z., Wang, W., Lu, X., Shen, J., Ling, H., Xu, T., Shao, L.: Human-aware motion deblurring. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 5572–5581 (2019)
2019
-
[50]
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Shi, J., Xu, L., Jia, J.: Just noticeable defocus blur detection and estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 657–665 (2015)
2015
-
[51]
In: NeurIPS (2012)
Snoek, J., Larochelle, H., Adams, R.P.: Practical bayesian optimization of machine learning algorithms. In: NeurIPS (2012)
2012
-
[52]
In: Proceedings of the IEEE/CVF International Conference on Computer Vision
Son, H., Lee, J., Cho, S., Lee, S.: Single image defocus deblurring using kernel-sharing parallel atrous convolutions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 2642–2650 (2021)
2021
-
[53]
arXiv preprint arXiv:2501.16583 (2025)
Tan, H., Gu, A., et al.: Hi-mamba: Hierarchical recurrent ssm for vision restoration. arXiv preprint arXiv:2501.16583 (2025)
-
[54]
arXiv preprint arXiv:2402.04523 (2024)
Tang, Y., Xu, Y., Zhang, Y.: A survey on vision mamba models: Applications and architectures. arXiv preprint arXiv:2402.04523 (2024)
-
[55]
In: Proceedings of the IEEE conference on computer vision and pattern recognition
Tao, X., Gao, H., Shen, X., Wang, J., Jia, J.: Scale-recurrent network for deep image deblurring. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 8174–8182 (2018)
2018
-
[56]
NeurIPS (2017)
Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. NeurIPS (2017)
2017
-
[57]
In: CVPR (2022)
Wang, C., Zhang, Y., Lin, L., et al.: Uformer: A general u-shaped transformer for image restoration. In: CVPR (2022)
2022
-
[58]
In: NeurIPS (2020)
Wang, S., Li, B.Z., Khabsa, M., et al.: Linformer: Self-attention with linear complexity. In: NeurIPS (2020)
2020
-
[59]
arXiv preprint arXiv:2501.18401 (2024)
Wu, J., Zhang, Y., et al.: Vmambair: Omni-selective scan for efficient ssms. arXiv preprint arXiv:2501.18401 (2024)
-
[60]
In: Proceedings of the IEEE conference on computer vision and pattern recognition
Xu, L., Zheng, S., Jia, J.: Unnatural l0 sparse representation for natural image deblurring. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1107–1114 (2013)
2013
-
[61]
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Yang, W., Tan, R.T., Feng, J., Liu, J., Guo, Z., Yan, S.: Deep joint rain detection and removal from a single image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1357–1366 (2017)
2017
-
[62]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Yasarla, R., Patel, V.M.: Uncertainty guided multi-scale residual learning-using a cycle spinning cnn for single image de-raining. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8405–8414 (2019)
2019
- [63]
-
[64]
In: NeurIPS (2020)
Zaheer, M., Gururangan, S., Ainslie, J., et al.: Big bird: Transformers for longer sequences. In: NeurIPS (2020)
2020
-
[65]
In: CVPR (2022)
Zamir, S.W., Arora, A., Khan, S., et al.: Restormer: Efficient transformer for high-resolution image restoration. In: CVPR (2022)
2022
-
[66]
In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16
Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H., Shao, L.: Learning enriched features for real image restoration and enhancement. In: Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16. pp. 492–511. Springer (2020)
2020
-
[67]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.H., Shao, L.: Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 14821–14831 (2021)
2021
-
[68]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Zhang, H., Dai, Y., Li, H., Koniusz, P.: Deep stacked hierarchical multi-patch network for image deblurring. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 5978–5986 (2019)
2019
-
[69]
In: Proceedings of the IEEE conference on computer vision and pattern recognition
Zhang, J., Pan, J., Ren, J., Song, Y., Bao, L., Lau, R.W., Yang, M.H.: Dynamic scene deblurring using spatially variant recurrent neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2521–2529 (2018)
2018
-
[70]
In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition
Zhang, K., Luo, W., Zhong, Y., Ma, L., Stenger, B., Liu, W., Li, H.: Deblurring by realistic blurring. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 2737–2746 (2020)
2020
-
[71]
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Zhang, Y., Li, D., Law, K.L., Wang, X., Qin, H., Li, H.: Idr: Self-supervised image denoising via iterative data refinement. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2451–2460 (2022)
2022
-
[72]
arXiv preprint arXiv:2303.02881 (2023)
Zhang, Y., Li, D., Shi, X., He, D., Song, K., Wang, X., Qin, H., Li, H.: Kbnet: Kernel basis network for image restoration. arXiv preprint arXiv:2303.02881 (2023)
-
[73]
arXiv preprint arXiv:2403.03000 (2024)
Zhang, Y., Gu, A., et al.: Samba: Efficient hybrid ssm-attention model for long- range tasks. arXiv preprint arXiv:2403.03000 (2024)
-
[74]
arXiv preprint arXiv:2403.17902 (2024)
Zhang, Y., Xu, Y., Chen, J., et al.: Mambairv2: Improved state space model for visual restoration. arXiv preprint arXiv:2403.17902 (2024)
-
[75]
arXiv preprint arXiv:2402.08538 (2024)
Zhao, H., Guo, Y., et al.: Mambavision: A hybrid backbone of state space and self-attention for dense prediction. arXiv preprint arXiv:2402.08538 (2024)
-
[76]
Mambairv2: Attentive state space restoration. arXiv preprint arXiv:2411.15269, 2024
Zhou, H., Han, Q., et al.: Mamballie: Low-light image enhancement via ssms. arXiv preprint arXiv:2411.15269 (2024)
-
[77]
In: International conference on machine learning
Zhou, M., Huang, J., Guo, C.L., Li, C.: Fourmer: An efficient global modeling paradigm for image restoration. In: International conference on machine learning. pp. 42589–42601. PMLR (2023)
2023
-
[78]
arXiv preprint arXiv:2402.14631 (2024)
Zhou, W., Lin, X., et al.: Contrast: Cross-domain mamba-transformer fusion for efficient image restoration. arXiv preprint arXiv:2402.14631 (2024)
2024
-
[79]
In: Proceedings of the 32nd ACM International Conference on Multimedia
Zou, Z., Yu, H., Huang, J., Zhao, F.: Freqmamba: Viewing mamba from a frequency perspective for image deraining. In: Proceedings of the 32nd ACM International Conference on Multimedia. pp. 1905–1914 (2024)
2024
discussion (0)