pith. sign in

arxiv: 2605.13258 · v2 · pith:ZF52M5T2new · submitted 2026-05-13 · 💻 cs.CV · cs.AI

X-Restormer++: 1st Place Solution for the UG2+ CVPR 2026 All-Weather Restoration Challenge

Pith reviewed 2026-05-14 19:58 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords image restorationall-weather conditionstransformeredge-aware lossdata expansioncomputer vision challenge
0
0 comments X

The pith

X-Restormer++ wins first place in all-weather image restoration by adding adaptive scaling, edge-aware loss, and extra training pairs

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a winning entry for restoring images degraded by rain, snow, fog and similar conditions. It starts from the X-Restormer baseline, which uses dual attention blocks to capture both global channel relations and local spatial structure. Three targeted additions are introduced: spatially adaptive scaling of the input to adjust weights per location, a Gradient-Guided Edge-Aware loss that supplements L1 and multi-scale SSIM to protect edges, and an expanded training set that adds 24,500 new degraded-clean pairs. These changes together produce the top-ranked model in the UG2+ CVPR 2026 challenge track.

Core claim

Extending the X-Restormer dual-attention backbone with spatially-adaptive input scaling, a Gradient-Guided Edge-Aware loss term, and a substantially larger training corpus of degraded-clean image pairs yields a model that achieves first place in the all-weather image restoration challenge.

What carries the argument

X-Restormer dual-attention blocks (Multi-DConv Head Transposed Attention and Overlapping Cross-Attention) augmented by spatially-adaptive input scaling and the composite loss that includes the Gradient-Guided Edge-Aware term.

If this is right

  • Adaptive input scaling improves handling of spatially varying weather effects.
  • Gradient guidance in the loss preserves edge and structural detail that standard losses tend to blur.
  • The larger dataset increases robustness across diverse weather types and scenes.
  • The combined loss balances pixel accuracy with perceptual and structural fidelity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar combinations of scaling, specialized loss, and data volume may transfer to other restoration problems such as denoising or dehazing.
  • The result suggests that benchmark success often depends on careful integration of existing components as much as on new architectural ideas.
  • Individual ablation of each addition on the official test set would clarify which change drives the largest share of the gain.

Load-bearing premise

The observed performance lift stems primarily from the three listed modifications rather than from undisclosed changes in training schedule, optimizer, or data filtering.

What would settle it

Train the original X-Restormer on the same expanded dataset with identical hyperparameters but without the adaptive scaling and Gradient-Guided Edge-Aware loss, then measure whether its challenge score remains below the reported winning result.

Figures

Figures reproduced from arXiv: 2605.13258 by Fengjie Zhu, Leilei Cao, Yingfang Zhu, Youwei Pan.

Figure 1
Figure 1. Figure 1: The overall framework of our proposed method. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Qualitative results on challenging scenes (Part 1). Each scene shows the degraded input and our restored output. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Qualitative results on challenging scenes (Part 2). Each scene shows the degraded input and our restored output. [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
read the original abstract

In this work, we present our winning solution for the 8th UG2+ Challenge (CVPR 2026) Track 1: Image Restoration under All-weather Conditions. Our method is built upon the X-Restormer baseline, which captures both channel-wise global dependencies and spatially-local structural information through its dual-attention design (Multi-DConv Head Transposed Attention and Overlapping Cross-Attention), augmented with the spatially-adaptive input scaling mechanism from Restormer-Plus. We adopt a two-stage training strategy with dual-model ensemble inference. In the first stage, Model B is trained from scratch on a large-scale diverse dataset randomly sampled from the FoundIR training set (approximately 800 GB out of 4.84 TB), covering five degradation types: blur, haze, rain, snow, and composite conditions such as co-occurring rain and haze. In the second stage, Model A is fine-tuned on the WeatherStream dataset (rain and snow splits) using Model B's final checkpoint as pretrained initialization, enabling efficient domain adaptation with a substantially smaller dataset. To better preserve structural details during training, we propose a novel Gradient-Guided Edge-Aware (GGEA) Loss, which applies Sobel operators to the ground-truth image to construct a spatially adaptive weight map that assigns higher supervision to edge and high-frequency regions. This is incorporated alongside L1 and Multi-Scale SSIM losses in a unified training objective. At inference time, predictions from the two models are fused via a weighted average, out = 0.4 x outA + 0.6 x outB, where the higher weight assigned to Model B reflects its stronger generalization ability from large-scale pretraining. With these strategies, our proposed method successfully ranks 1st in the challenge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper presents X-Restormer++ as the winning entry for the UG2+ CVPR 2026 All-Weather Restoration Challenge (Track 1). It starts from the X-Restormer baseline and adds three modifications: spatially-adaptive input scaling, a Gradient-Guided Edge-Aware (GGEA) loss combined with L1 and MS-SSIM, and an expanded training set of 24,500 additional pairs drawn from FoundIR and WeatherBench, reporting a first-place ranking.

Significance. If the ranking is reproducible and the modifications can be isolated, the work supplies a practical recipe for improving transformer-based restoration under combined weather degradations. The empirical outcome in a public challenge demonstrates that modest architectural and loss adjustments plus data scaling can yield competitive gains on all-weather benchmarks.

major comments (1)
  1. [Methods / Experiments] The central claim attributes the first-place ranking to the three listed changes, yet the manuscript contains no ablation tables that retrain the unmodified X-Restormer baseline on the identical expanded dataset, loss combination, optimizer schedule, and data-filtering protocol. Without these controls, the performance delta cannot be confidently ascribed to the proposed modifications rather than to unreported differences in training length or preprocessing.
minor comments (1)
  1. [Abstract / §3] The abstract and methods section refer to a 'unified training objective' but do not state the numerical weights used to combine the GGEA, L1, and MS-SSIM terms; these coefficients should be reported explicitly.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the detailed review and constructive suggestion. We address the major comment on the lack of isolating ablations below and commit to strengthening the experimental section in revision.

read point-by-point responses
  1. Referee: The central claim attributes the first-place ranking to the three listed changes, yet the manuscript contains no ablation tables that retrain the unmodified X-Restormer baseline on the identical expanded dataset, loss combination, optimizer schedule, and data-filtering protocol. Without these controls, the performance delta cannot be confidently ascribed to the proposed modifications rather than to unreported differences in training length or preprocessing.

    Authors: We agree that the current manuscript does not contain the requested ablation tables that retrain the unmodified X-Restormer baseline under the exact same expanded dataset, loss combination, optimizer schedule, and data-filtering protocol. This omission makes it difficult to fully isolate the contribution of each proposed change from potential differences in training details. In the revised manuscript we will add a dedicated ablation section that includes these controls: (1) X-Restormer baseline retrained on the full 24,500-pair expanded set using the identical L1 + MS-SSIM + GGEA loss, optimizer, and schedule; (2) incremental addition of the spatially-adaptive scaling module; and (3) the full X-Restormer++ configuration. These new experiments will be run with the same data-filtering protocol and reported with the same metrics used in the challenge submission. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical challenge solution built on external baseline

full rationale

The manuscript is a competition report that extends the external X-Restormer baseline with three engineering changes (spatially-adaptive scaling, GGEA loss, and added training pairs from FoundIR/WeatherBench) and records a 1st-place ranking on the UG2+ leaderboard. No equations, predictions, or first-principles derivations appear; the central claim is an externally validated empirical outcome rather than a quantity that reduces to its own fitted inputs or self-citations by construction. Self-citations to prior Restormer variants are present but non-load-bearing, as the performance delta is measured against the public challenge metric and not derived from internal redefinitions.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the assumption that the baseline X-Restormer already captures the necessary dependencies and that the new loss and data additions produce additive gains without negative interactions.

free parameters (1)
  • loss combination weights
    Weights balancing L1, MS-SSIM, and GGEA losses are chosen during training and directly affect the reported performance.
axioms (1)
  • domain assumption Dual-attention design in X-Restormer effectively captures channel-wise global and spatially-local information
    Invoked when stating the baseline framework.
invented entities (1)
  • Gradient-Guided Edge-Aware (GGEA) loss no independent evidence
    purpose: To better preserve structural details and edge information during training
    New loss function introduced without independent validation outside the challenge results.

pith-pipeline@v0.9.0 · 5520 in / 1320 out tokens · 62494 ms · 2026-05-14T19:58:22.540307+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.