Style-Decoupled Adaptive Routing Network for Underwater Image Enhancement
Pith reviewed 2026-05-10 14:46 UTC · model grok-4.3
The pith
SDAR-Net separates degradation style from scene structure in underwater images and routes enhancement adaptively for each input.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
SDAR-Net formulates input features into dynamic degradation style embeddings and static scene structural representations. It then applies an adaptive routing mechanism that evaluates the style embeddings to predict soft weights at different enhancement states and uses those weights to guide the fusion of the corresponding representations, satisfying the restoration needs of each image individually.
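The routing step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear router, the function names (`route_weights`, `fuse`), the vector sizes, and the toy numbers are all hypothetical, and the real network presumably operates on learned feature maps rather than short lists.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_weights(style_embedding, router_matrix):
    """Hypothetical linear router: one score per enhancement state, then softmax."""
    scores = [
        sum(w * s for w, s in zip(row, style_embedding))
        for row in router_matrix
    ]
    return softmax(scores)


def fuse(state_features, weights):
    """Weighted sum of the per-state feature vectors (all of equal length)."""
    length = len(state_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, state_features))
        for i in range(length)
    ]


# Toy inputs (assumed values): a 2-D style embedding and three enhancement states.
style = [0.2, 1.5]
router = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

weights = route_weights(style, router)  # soft weights that sum to 1
fused = fuse(states, weights)           # per-image fused representation
```

Because the weights are soft rather than one-hot, an image with intermediate degradation can draw on several enhancement states at once, which is the behaviour the core claim attributes to the routing mechanism.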
What carries the argument
The adaptive routing mechanism, which evaluates the style embeddings, predicts soft weights over the enhancement states, and uses those weights to fuse the corresponding image representations.
If this is right
- Reaches 25.72 dB PSNR on real-world underwater benchmarks, exceeding prior methods.
- Delivers appropriate enhancement levels instead of over-processing mild degradations or under-recovering severe ones.
- Improves accuracy in downstream tasks such as object detection and segmentation on underwater data.
- Allows one network to handle the full spectrum of degradation without dataset-specific retraining.
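For readers weighing the headline number, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below computes it for flattened pixel lists. The function name `psnr` and the 8-bit default range are assumptions; the abstract does not state the paper's evaluation protocol (color space, data range, or how scores are averaged over a benchmark).

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy usage with assumed pixel values.
score = psnr([52.0, 60.0, 61.0, 70.0], [50.0, 62.0, 61.0, 69.0])
```

Since PSNR is logarithmic in the error, small differences in data range or averaging convention can shift the reported value, which is one reason the protocol details matter when comparing against prior methods.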
Where Pith is reading between the lines
- The same style-structure split might apply to other appearance-only degradations such as fog or low light.
- Storing reusable structural representations could lower compute when processing image sequences from the same location.
- Embedding the router in an underwater robot could enable on-the-fly adjustment without storing multiple enhancement models.
Load-bearing premise
Underwater degradation primarily shifts appearance while leaving the underlying scene structure unchanged, so the two can be cleanly separated.
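One common way to state this premise formally is the simplified underwater image formation model that is widely used in the UIE literature. The abstract does not say which model the authors assume, so the symbols below (I_c, J_c, t_c, B_c, \beta_c, d) are illustrative choices rather than the paper's notation.

```latex
% Simplified underwater image formation, per colour channel c \in \{R, G, B\}
I_c(x) = J_c(x)\, t_c(x) + B_c \bigl(1 - t_c(x)\bigr),
\qquad t_c(x) = e^{-\beta_c\, d(x)}

% Spatial gradient of the observed image (product rule)
\nabla I_c(x) = t_c(x)\, \nabla J_c(x) + \bigl(J_c(x) - B_c\bigr)\, \nabla t_c(x)
```

Here I_c is the observed image, J_c the clear scene, B_c the background light, \beta_c the channel-wise attenuation coefficient, and d(x) the scene distance. Under this model the scene edges in \nabla J_c survive, but they are scaled by t_c(x) and mixed with a depth-dependent term, which suggests the premise holds best when transmission is neither very small nor rapidly varying across the image.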
What would settle it
The premise would be undermined if, on paired images of an identical scene captured under increasing degradation levels, the extracted structural representations changed noticeably, or if the adaptive method showed no gain over a uniform baseline.
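A check of this kind could be sketched as follows. It is an illustration under assumptions: the structural features are taken to be flat vectors already extracted from the same scene at several degradation levels, cosine similarity is one possible invariance measure among many, and the names `cosine_similarity` and `structure_invariance` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def structure_invariance(features_by_level):
    """Mean similarity between the mildest level (index 0) and every other level."""
    reference = features_by_level[0]
    sims = [cosine_similarity(reference, f) for f in features_by_level[1:]]
    return sum(sims) / len(sims)


# Toy structural features for one scene at three degradation levels (assumed values).
features = [
    [1.0, 2.0, 3.0],
    [1.1, 2.0, 2.9],
    [0.9, 2.1, 3.2],
]
invariance = structure_invariance(features)
```

A score that stays close to 1 as degradation increases would support the decoupling premise, while a steady decline would indicate that degradation leaks into the structural branch.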
Original abstract
Underwater Image Enhancement (UIE) is essential for robust visual perception in marine applications. However, existing methods predominantly rely on uniform mapping tailored to average dataset distributions, leading to over-processing mildly degraded images or insufficient recovery for severe ones. To address this challenge, we propose a novel adaptive enhancement framework, SDAR-Net. Unlike existing uniform paradigms, it first decouples specific degradation styles from the input and subsequently modulates the enhancement process adaptively. Specifically, since underwater degradation primarily shifts the appearance while keeping the scene structure, SDAR-Net formulates image features into dynamic degradation style embeddings and static scene structural representations through a carefully designed training framework. Subsequently, we introduce an adaptive routing mechanism. By evaluating style features and adaptively predicting soft weights at different enhancement states, it guides the weighted fusion of the corresponding image representations, accurately satisfying the adaptive restoration demands of each image. Extensive experiments show that SDAR-Net achieves a new state-of-the-art (SOTA) performance with a PSNR of 25.72 dB on real-world benchmark, and demonstrates its utility in downstream vision tasks. Our code is available at https://github.com/WHU-USI3DV/SDAR-Net.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes SDAR-Net, an adaptive framework for underwater image enhancement that first decouples input features into dynamic degradation style embeddings and static scene structural representations (based on the premise that degradation primarily affects appearance while preserving structure), then applies an adaptive routing mechanism to predict soft weights and fuse representations for per-image enhancement. It reports achieving SOTA performance with a PSNR of 25.72 dB on real-world benchmarks and improved results on downstream vision tasks, with code released.
Significance. If the decoupling premise and empirical gains hold under rigorous validation, the work could meaningfully advance UIE by replacing uniform mappings with style-aware adaptive routing, improving robustness across varying degradation severities in marine applications. The provision of code is a positive for reproducibility.
major comments (2)
- Abstract and §1: The load-bearing premise that 'underwater degradation primarily shifts the appearance while keeping the scene structure' is asserted without supporting evidence such as feature invariance metrics, controlled same-scene multi-degradation experiments, or visualizations showing that structural representations remain degradation-free; this directly underpins the style embeddings, static representations, and subsequent adaptive routing, so its unverified status risks rendering the fusion ineffective if scattering/absorption also degrades edges/contrast in a content-dependent manner.
- Experiments section: The central SOTA claim (PSNR 25.72 dB on real-world benchmark) and downstream gains are presented without reported details on training/validation splits, exact baseline re-implementations, statistical significance tests, or ablation studies isolating the style-decoupling and routing components; this absence makes it impossible to assess whether the performance edge is robust or attributable to the proposed architecture.
minor comments (1)
- Notation: The distinction between 'dynamic degradation style embeddings' and 'static scene structural representations' is introduced without a formal definition or diagram clarifying how they are extracted in the training framework.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments have identified important areas where additional evidence and reporting will strengthen the presentation of our work. We address each major comment below and describe the revisions we will implement.
- Referee: Abstract and §1: The load-bearing premise that 'underwater degradation primarily shifts the appearance while keeping the scene structure' is asserted without supporting evidence such as feature invariance metrics, controlled same-scene multi-degradation experiments, or visualizations showing that structural representations remain degradation-free; this directly underpins the style embeddings, static representations, and subsequent adaptive routing, so its unverified status risks rendering the fusion ineffective if scattering/absorption also degrades edges/contrast in a content-dependent manner.
Authors: We appreciate the referee highlighting the need for explicit validation of this premise. While the assumption aligns with the physical model of underwater imaging (absorption and scattering primarily alter color and contrast rather than geometric structure), we agree that empirical support is currently insufficient. In the revised manuscript, we will add: (1) quantitative feature invariance metrics (e.g., similarity scores of structural embeddings under simulated degradation), (2) controlled experiments using same-scene synthetic pairs with varying degradation levels, and (3) visualizations demonstrating preservation of edges and semantic structure. These will be incorporated into the introduction and experiments sections to better substantiate the decoupling strategy.
Revision: yes
- Referee: Experiments section: The central SOTA claim (PSNR 25.72 dB on real-world benchmark) and downstream gains are presented without reported details on training/validation splits, exact baseline re-implementations, statistical significance tests, or ablation studies isolating the style-decoupling and routing components; this absence makes it impossible to assess whether the performance edge is robust or attributable to the proposed architecture.
Authors: We acknowledge that the current experimental reporting lacks sufficient detail for full reproducibility and robustness assessment. In the revised manuscript, we will expand the Experiments section to include: explicit descriptions of all training/validation/test splits, precise re-implementation details for baselines (including hyperparameters and any modifications), statistical significance tests (e.g., p-values from multiple runs), and comprehensive ablation studies isolating the style-decoupling and adaptive routing components. Standard deviations will also be reported for key metrics. These additions will enable clearer evaluation of the claimed performance gains.
Revision: yes
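The rebuttal promises significance tests without naming one. As one possible choice, the sketch below runs an exact paired sign-flip (permutation) test on matched scores, for example per-run or per-image PSNR from two methods. The function name `paired_sign_flip_pvalue` and the sample scores are hypothetical, and exhaustive enumeration over 2^n sign patterns is only practical for small n (roughly 20 pairs or fewer).

```python
from itertools import product


def paired_sign_flip_pvalue(scores_a, scores_b):
    """Two-sided exact sign-flip test on paired differences.

    Under the null hypothesis that the two methods are interchangeable,
    each paired difference is equally likely to have either sign.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if flipped >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical PSNR values (dB) from five matched runs of two methods.
method_a = [25.9, 25.6, 25.8, 25.7, 25.6]
method_b = [25.1, 25.3, 25.0, 25.4, 25.2]
p_value = paired_sign_flip_pvalue(method_a, method_b)
```

With only a handful of pairs the smallest attainable p-value is 2 / 2^n, so reporting the number of runs alongside the p-value is needed to interpret it.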
Circularity Check
No circularity: empirical architecture with independent training and evaluation
The paper proposes SDAR-Net as a neural network architecture that decouples degradation styles from scene structure under an explicit premise, then applies adaptive routing for fusion. All performance claims (e.g., 25.72 dB PSNR) are presented as outcomes of training on benchmarks and downstream task evaluation, not as quantities that reduce by the paper's own equations or definitions to fitted inputs. No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked in the provided text to justify the core separation or routing; the framework is self-contained as a proposed model with a described training procedure. The decoupling premise is an assumption, not a derived result that loops back on itself.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: underwater degradation primarily shifts the appearance while keeping the scene structure.