RealRep: Generalized SDR-to-HDR Conversion via Attribute-Disentangled Representation Learning

Gang He; Kepeng Xu; Lin Zhang; Li Xu; Siqi Wang; Weiran Wang; Yu-Wing Tai

RealRep uses attribute-disentangled learning to generalize SDR-to-HDR conversion across real-world degradations.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.

T0 review · grok-4.3

2026-05-22 15:50 UTC pith:EZYITWQR

Load-bearing objection: RealRep adds a disentangled luma-chroma approach with controlled mapping that targets real SDR degradations better than fixed tone operators, though the generalization edge still needs tighter validation.

arxiv 2505.07322 v4 pith:EZYITWQR submitted 2025-05-12 cs.CV

classification cs.CV

keywords SDR-to-HDR conversion, attribute disentanglement, representation learning, tone mapping, color gamut, degradation awareness, contrastive learning, adaptive mapping

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes RealRep, a framework for converting Standard Dynamic Range images to High Dynamic Range that learns to separate luminance and chrominance attributes. This separation helps capture how different SDR content varies in appearance and quality. By generating contrastive pairs sensitive to these differences and using a controlled mapping network, the method adapts the conversion process to various degradations. The result is more consistent and accurate HDR reconstructions than previous fixed approaches. This matters because real SDR content comes from many sources with inconsistent quality, making a one-size-fits-all tone mapping insufficient.
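To make the luminance/chrominance split concrete, here is a minimal sketch using the fixed BT.709 luma/chroma transform on gamma-encoded RGB values. This is an assumption chosen for illustration: the review does not state how RealRep separates the two attributes, and a learned disentanglement is not the same thing as a fixed color transform. The function names are hypothetical.

```python
# Illustrative only: a fixed BT.709 luma/chroma split, NOT RealRep's learned
# disentanglement. Function names are hypothetical.
KR, KG, KB = 0.2126, 0.7152, 0.0722  # BT.709 luma coefficients


def rgb_to_ycbcr(r, g, b):
    """Split one RGB pixel (floats in [0, 1]) into luma Y and chroma (Cb, Cr)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB))  # blue-difference chroma, in [-0.5, 0.5]
    cr = (r - y) / (2.0 * (1.0 - KR))  # red-difference chroma, in [-0.5, 0.5]
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Invert rgb_to_ycbcr, recovering the RGB pixel."""
    r = y + 2.0 * (1.0 - KR) * cr
    b = y + 2.0 * (1.0 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b
```

A neutral gray pixel maps to zero chroma, which is the sense in which luma and chroma carry separate information: brightness changes move Y, while color casts move Cb and Cr.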

Core claim

The central discovery is that explicitly disentangling luminance and chrominance components through Realistic Attribute-Disentangled Representation Learning, combined with luma-chroma aware negative exemplars and a degradation-domain aware controlled mapping network, enables robust adaptive hierarchical mapping from diverse SDR inputs to perceptually faithful HDR outputs with wide color gamut.

What carries the argument

Realistic Attribute-Disentangled Representation Learning (RealRep) that disentangles luminance and chrominance to model intrinsic content variations, paired with the Degradation-Domain Aware Controlled Mapping Network (DDACMNet) that uses control-aware normalization for adaptive mapping.
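One plausible reading of "control-aware normalization" is a feature-wise modulation in which a degradation code predicts a per-channel scale and shift applied after normalization. The sketch below assumes that reading; the review itself notes that the control mechanism is underspecified, so the function name, the linear maps `w_gamma` and `w_beta`, and the `1 + ...` scale parameterization are all hypothetical and not DDACMNet's actual design.

```python
import math


def control_aware_norm(features, degradation, w_gamma, w_beta, eps=1e-5):
    """Normalize a feature vector, then scale and shift it using a degradation code.

    features:    list of C floats (one feature vector)
    degradation: list of D floats (hypothetical degradation embedding)
    w_gamma, w_beta: C x D weight matrices as lists of rows (hypothetical learned weights)
    """
    mean = sum(features) / len(features)
    var = sum((f - mean) ** 2 for f in features) / len(features)
    normed = [(f - mean) / math.sqrt(var + eps) for f in features]
    # Scale is parameterized as 1 + W_gamma @ d so zero weights give plain normalization.
    gamma = [1.0 + sum(w * d for w, d in zip(row, degradation)) for row in w_gamma]
    beta = [sum(w * d for w, d in zip(row, degradation)) for row in w_beta]
    return [g * n + b for g, n, b in zip(gamma, normed, beta)]
```

With all weights at zero the output reduces to ordinary normalization, so the degradation code only changes the mapping to the extent the weights allow.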

Load-bearing premise

Disentangling luminance and chrominance components captures enough intrinsic variations in SDR content to enable reliable adaptive mapping.

What would settle it

A counterexample would be a set of real-world SDR images with novel degradation types where the method produces color shifts or loses detail in HDR output compared to baselines.

If this is right

Adaptive mapping handles diverse SDR styles and degradations better than fixed operators.
Contrastive learning with degradation-sensitive pairs improves modeling of tone discrepancies.
Two-stage framework allows hierarchical adaptation guided by degradation features.
Improved generalization leads to perceptually faithful color gamut reconstruction across distributions.
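The contrastive-pair idea in the list above can be illustrated with a standard InfoNCE-style loss over an anchor, one positive, and several negatives. This is a generic formulation assumed for illustration: the exact loss used by RealRep is not given in this review, and `info_nce`, `cosine`, and the temperature value are hypothetical choices.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    lse = m + math.log(sum(math.exp(l - m) for l in logits))
    return lse - logits[0]
```

In this framing, the negatives would be embeddings of luma- or chroma-perturbed versions of the same content, so the loss pushes the representation to be sensitive to those perturbations.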

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar disentanglement could extend to video SDR-to-HDR for temporal consistency.
Applying this to other image enhancement tasks like low-light or super-resolution might benefit from attribute separation.
Testing on synthetic degradations could validate the robustness claims further.
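A synthetic stress test of the kind suggested above might apply simple, controllable degradations to SDR pixels before conversion. The sketch below is a hypothetical set of such perturbations, not the paper's negative exemplar generation or test protocol; the function names and parameter choices are assumptions.

```python
def shift_tone(pixels, gamma):
    """Tone perturbation: apply a power curve to every channel (values in [0, 1])."""
    return [tuple(c ** gamma for c in p) for p in pixels]


def scale_saturation(pixels, factor):
    """Chroma perturbation: move each channel toward (factor < 1) or away from the pixel's luma."""
    out = []
    for r, g, b in pixels:
        y = 0.2126 * r + 0.7152 * g + 0.0722 * b  # BT.709 luma weights
        out.append(tuple(min(1.0, max(0.0, y + factor * (c - y))) for c in (r, g, b)))
    return out


def clip_highlights(pixels, ceiling):
    """Simulate sensor clipping by capping every channel at `ceiling`."""
    return [tuple(min(c, ceiling) for c in p) for p in pixels]


def quantize(pixels, bits):
    """Simulate coarse quantization by rounding each channel to 2**bits levels."""
    levels = (1 << bits) - 1
    return [tuple(round(c * levels) / levels for c in p) for p in pixels]
```

Holding out one of these degradation types during training and evaluating on it afterwards would be one way to probe the out-of-distribution concern raised in the referee report.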

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

RealRep adds a disentangled luma-chroma approach with controlled mapping that targets real SDR degradations better than fixed tone operators, though the generalization edge still needs tighter validation.

The main point is that this paper builds a framework called RealRep to convert SDR to HDR by explicitly separating luminance and chrominance attributes, then feeding those into a degradation-aware network for adaptive mapping. That combination is the actual new piece compared to the fixed operators they cite as baselines. They add luma- and chroma-aware negative exemplar generation to create contrastive pairs that model tone differences across SDR styles, and wrap it in DDACMNet, a lightweight two-stage setup that uses control-aware normalization to modulate the output based on detected degradations. This setup directly tackles the practical issue of varied real-world SDR content like compression artifacts or gamut shifts, which fixed methods handle poorly. The abstract frames the experiments as showing consistent gains in generalization and perceptual fidelity, and the approach feels grounded in addressing distribution shifts rather than just fitting more parameters.

On the soft spots, the stress-test concern about whether the generated negatives span enough unseen degradations holds some weight here. If the augmentations stay mostly within simulated tone and noise variations, the attribute priors might not steer the mapping correctly on inputs with heavy sensor clipping or compression not seen in training. That could weaken the robustness half of the claim, even if the in-distribution results look fine. The paper does not appear to lean on circular definitions or unfalsifiable fitting, which helps.

This work is aimed at computer vision researchers and media production teams dealing with HDR display pipelines. A reader focused on practical conversion tools would find the framework and the specific negative sampling strategy useful to build on. It deserves a serious referee because the technical choices are explicit and the problem is timely, even if the experiments will likely need more out-of-distribution testing and clearer ablation breakdowns during review.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes RealRep, a generalized SDR-to-HDR conversion framework. It introduces Realistic Attribute-Disentangled Representation Learning to explicitly disentangle luminance and chrominance components for capturing intrinsic content variations across SDR distributions, a Luma-/Chroma-aware negative exemplar generation strategy to construct degradation-sensitive contrastive pairs, and the Degradation-Domain Aware Controlled Mapping Network (DDACMNet), a lightweight two-stage architecture that performs adaptive hierarchical mapping via control-aware normalization conditioned on degradation features. The central claim is that this approach yields superior generalization and perceptually faithful HDR color gamut reconstruction compared to state-of-the-art methods, as supported by extensive experiments.

Significance. If the generalization results hold under rigorous out-of-distribution testing, the work would be significant for computer vision and multimedia processing by moving beyond fixed tone-mapping operators to handle diverse real-world SDR degradations. The explicit disentanglement of attributes combined with contrastive modeling of tone discrepancies offers a principled way to build degradation-aware priors, which could improve robustness in practical HDR-WCG pipelines. The lightweight design of DDACMNet is a practical strength if the performance gains are reproducible.

major comments (1)

[Abstract and Experiments (§4)] The generalization claim (abstract and §4) rests on the assumption that attribute-disentangled priors from internally generated contrastive pairs transfer to arbitrary real-world SDR inputs. However, the Luma-/Chroma-aware negative exemplar generation is described as spanning simulated tone and noise variations within the training distribution; without explicit evaluation on out-of-distribution cases such as heavy compression artifacts or sensor-specific gamut clipping, the robustness of DDACMNet's control-aware normalization remains unverified and load-bearing for the headline result.

minor comments (2)

[Method (§3)] Clarify the exact formulation of the contrastive loss and how the degradation-conditioned features are injected into the normalization layers of DDACMNet; the current description leaves the control mechanism somewhat underspecified.
[Experiments (§4)] Ensure all quantitative tables report both mean and standard deviation across multiple runs or datasets to allow assessment of statistical significance of the reported outperformance.
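The second minor comment asks for mean and standard deviation in the result tables. A minimal sketch of how a table cell could be produced from per-run scores follows; the helper names are hypothetical, and any scores passed in would be the authors' own measurements, none of which are reproduced here.

```python
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) for a list of per-run scores."""
    mean = statistics.mean(scores)
    # The sample standard deviation needs at least two runs; report 0.0 for a single run.
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


def format_cell(scores, digits=2):
    """Format scores as 'mean ± std' for a results table."""
    mean, std = summarize(scores)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"
```

For example, three placeholder scores of 30.0, 31.0, and 32.0 would be reported as `31.00 ± 1.00`.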

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our generalization claims. We address the major comment point-by-point below and will incorporate revisions to strengthen the evaluation of out-of-distribution robustness.

Referee: [Abstract and Experiments (§4)] The generalization claim (abstract and §4) rests on the assumption that attribute-disentangled priors from internally generated contrastive pairs transfer to arbitrary real-world SDR inputs. However, the Luma-/Chroma-aware negative exemplar generation is described as spanning simulated tone and noise variations within the training distribution; without explicit evaluation on out-of-distribution cases such as heavy compression artifacts or sensor-specific gamut clipping, the robustness of DDACMNet's control-aware normalization remains unverified and load-bearing for the headline result.

Authors: We appreciate the referee's careful reading. The Luma-/Chroma-aware negative exemplar generation indeed relies on simulated tone and noise variations to construct contrastive pairs during training. However, the training data itself comprises diverse real-world SDR sources with natural degradations, and our test sets include real inputs exhibiting compression artifacts, noise, and gamut variations. The DDACMNet's control-aware normalization is explicitly conditioned on degradation features extracted from these real inputs, which enables the observed generalization. That said, we acknowledge that dedicated, isolated OOD benchmarks for heavy compression and sensor-specific gamut clipping are not separately reported. In the revision we will add such targeted evaluations (including quantitative metrics and qualitative results on held-out degradation types) to directly verify the robustness of the control mechanism.

Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical training and evaluation on contrastive pairs

The paper proposes an empirical framework (RealRep for attribute-disentangled learning plus DDACMNet with control-aware normalization) trained on internally generated degradation-sensitive contrastive pairs. No equations, derivations, or predictions are shown that reduce by construction to fitted parameters or self-referential definitions. Claims rest on experimental comparisons rather than any self-citation chain, uniqueness theorem, or ansatz smuggling. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract provides insufficient detail to enumerate specific free parameters or invented entities; relies on standard deep learning assumptions for representation learning and contrastive training.

axioms (1)

Domain assumption: Disentangling luminance and chrominance captures intrinsic content variations across SDR distributions.
Invoked as central to RealRep in the abstract description.

pith-pipeline@v0.9.0 · 5765 in / 1181 out tokens · 48120 ms · 2026-05-22T15:50:48.229692+00:00 · methodology

Original abstract

High-Dynamic-Range Wide-Color-Gamut (HDR-WCG) technology is becoming increasingly widespread, driving a growing need for converting Standard Dynamic Range (SDR) content to HDR. Existing methods primarily rely on fixed tone mapping operators, which struggle to handle the diverse appearances and degradations commonly present in real-world SDR content. To address this limitation, we propose a generalized SDR-to-HDR framework that enhances robustness by learning attribute-disentangled representations. Central to our approach is Realistic Attribute-Disentangled Representation Learning (RealRep), which explicitly disentangles luminance and chrominance components to capture intrinsic content variations across different SDR distributions. Furthermore, we design a Luma-/Chroma-aware negative exemplar generation strategy that constructs degradation-sensitive contrastive pairs, effectively modeling tone discrepancies across SDR styles. Building on these attribute-level priors, we introduce the Degradation-Domain Aware Controlled Mapping Network (DDACMNet), a lightweight, two-stage framework that performs adaptive hierarchical mapping guided by a control-aware normalization mechanism. DDACMNet dynamically modulates the mapping process via degradation-conditioned features, enabling robust adaptation across diverse degradation domains. Extensive experiments demonstrate that RealRep consistently outperforms state-of-the-art methods in both generalization and perceptually faithful HDR color gamut reconstruction.
