arxiv: 2604.06276 · v1 · submitted 2026-04-07 · 📡 eess.IV · cs.CV

Structural Regularities of Cinema SDR-to-HDR Mapping in a Controlled Mastering Workflow: A Pixel-wise Case Study on ASC StEM2

Pith reviewed 2026-05-10 19:09 UTC · model grok-4.3

classification 📡 eess.IV cs.CV
keywords SDR-to-HDR mapping · cinema mastering workflow · pixel-wise analysis · luminance correspondence · color saturation redistribution · ASC StEM2 · scene-referred EXR · decision map

The pith

In a controlled cinema workflow, SDR and HDR masters show a stable global monotonic luminance correspondence and largely consistent hue, and 82.4% of sampled image regions are classified as EXR-closer recovery under the study's operational decision map.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a pixel-wise case study on the ASC StEM2 dataset, which includes matched EXR, SDR, and HDR masters from the same ACES-based mastering process across 18,580 frames. It reports that the luminance mapping between SDR and HDR is highly stable and monotonic, with geometric structure largely preserved apart from sparse deviations in self-luminous highlights and specific material regions. Color analysis shows largely consistent hue and a redistribution of saturation: suppressed in shadows, expanded in midtones, and converging in highlights. Using EXR as a scene-referred anchor, the authors define a pixel-level decision map that separates EXR-closer recovery regions from regions needing content-adaptive adjustment. Under that operational definition, 82.4% of sampled regions fall into the EXR-closer category, which gives a quantitative baseline for structure-aware conversion methods.

Core claim

SDR and HDR masters exhibit a highly stable global monotonic correspondence in luminance, with geometric structure remaining largely consistent overall, and sparse deviations in self-luminous highlights and specific material regions. In color, the masters remain largely consistent in hue, with saturation exhibiting a redistribution pattern of shadow suppression, midtone expansion, and highlight convergence. Using EXR as a scene-referred anchor, 82.4% of sampled image regions are classified as EXR-closer recovery, while the remainder require localized adaptive adjustment.
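
The saturation redistribution pattern can be checked band by band. The sketch below is a minimal illustration, not the paper's procedure. It assumes matched pixels are already available as ICtCp triples, takes chroma as the Euclidean norm of (Ct, Cp), and uses hypothetical band edges (0.25 and 0.6 on the SDR intensity axis) that the source does not specify. The function names are illustrative.

```python
import math
from collections import defaultdict


def band_of(intensity, edges):
    """Map an intensity value to a coarse luminance band (edges are assumed)."""
    low, high = edges
    if intensity < low:
        return "shadow"
    if intensity < high:
        return "midtone"
    return "highlight"


def chroma_ratio_by_band(sdr_pixels, hdr_pixels, edges=(0.25, 0.6)):
    """Return mean HDR chroma divided by mean SDR chroma for each band.

    sdr_pixels and hdr_pixels are matched sequences of (I, Ct, Cp) tuples.
    Pixels are assigned to bands by their SDR intensity (an assumption).
    """
    sums = defaultdict(lambda: [0.0, 0.0])  # band -> [sdr chroma sum, hdr chroma sum]
    for (i_s, ct_s, cp_s), (_, ct_h, cp_h) in zip(sdr_pixels, hdr_pixels):
        acc = sums[band_of(i_s, edges)]
        acc[0] += math.hypot(ct_s, cp_s)
        acc[1] += math.hypot(ct_h, cp_h)
    return {
        band: (hdr_sum / sdr_sum if sdr_sum > 0 else float("nan"))
        for band, (sdr_sum, hdr_sum) in sums.items()
    }
```

Under this reading, a ratio below 1 in the shadow band and above 1 in the midtone band would be consistent with the reported suppression and expansion. A highlight ratio near 1 is one possible reading of convergence, which the source does not define numerically.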

What carries the argument

A pixel-level decision map constructed from three-domain (EXR, SDR, HDR) pixel-wise statistics that operationally separates EXR-closer recovery regions from content-adaptive adjustment regions.
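
To make the idea of a decision map concrete, here is a minimal sketch of one plausible rule. The paper's actual color space, normalization, distance metric, and thresholds are not given in this summary, so everything below is an assumption: luminance values are taken to be pre-normalized to a shared scale, distance is a plain absolute difference, and a pixel counts as EXR-closer when the HDR value is nearer to the EXR anchor than the SDR value is. The `margin` parameter is hypothetical.

```python
def decision_map(exr, sdr, hdr, margin=0.0):
    """Label each pixel True (EXR-closer) or False (content-adaptive).

    exr, sdr, hdr: equal-length sequences of luminance values, assumed to be
    normalized to a common scale. margin is a hypothetical cutoff that the
    HDR value must beat the SDR value by.
    """
    return [abs(h - e) + margin < abs(s - e) for e, s, h in zip(exr, sdr, hdr)]


def exr_closer_fraction(labels):
    """Fraction of pixels labeled EXR-closer (0.0 for an empty map)."""
    return sum(labels) / len(labels) if labels else 0.0
```

A headline figure such as the reported 82.4% would then be `exr_closer_fraction` evaluated over all sampled pixels, under whatever rule the authors actually used.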

If this is right

  • Global monotonic luminance correspondence implies that many SDR-to-HDR conversions can start with simple monotonic functions.
  • Sparse deviations in highlights and materials point to where localized adjustments are essential.
  • The saturation redistribution pattern offers a model for predicting color changes in tone mapping.
  • The high proportion of EXR-closer regions (82.4%) suggests that HDR masters often retain more scene information than SDR masters in this workflow.
  • This baseline supports development of learning-based models that account for shared-source mastering conditions.
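
The first point above can be illustrated with a monotone fit. The pool-adjacent-violators algorithm (PAVA) is one standard way to obtain a least-squares non-decreasing curve. Whether the authors used exactly this fit is not stated here, so the sketch below is an assumption-laden example with hypothetical function names.

```python
def pava(values, weights=None):
    """Least-squares non-decreasing fit to values via pool-adjacent-violators."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block is [weighted mean, total weight, sample count]
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def fit_monotone_curve(sdr_luma, hdr_luma):
    """Sort matched samples by SDR luminance and fit a monotone HDR curve."""
    pairs = sorted(zip(sdr_luma, hdr_luma))
    xs = [x for x, _ in pairs]
    ys = pava([y for _, y in pairs])
    return xs, ys
```

The size of the residual between the raw HDR samples and the fitted curve is one way to quantify how close the correspondence is to globally monotonic, and large residuals flag candidate regions for localized adjustment.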

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the stability holds across workflows, automated conversion tools could prioritize global mappings and apply local fixes only to identified regions.
  • Testing the same analysis on non-ACES workflows would reveal whether these regularities are workflow-specific or more general.
  • The decision map could serve as training labels for machine learning models to classify regions needing adaptive treatment.
  • Extending this to dynamic video sequences might show temporal consistency in the structural relationships.

Load-bearing premise

The single ACES-based workflow on the ASC StEM2 dataset produces representative structural relationships that can serve as a baseline for other cinema mastering pipelines.

What would settle it

A pixel-wise analysis of a different common-source SDR/HDR/EXR dataset from another mastering workflow that showed a substantially lower share of EXR-closer regions than 82.4%, or a breakdown of the monotonic luminance correspondence, would indicate that the observed regularities do not generalize beyond this workflow.

Figures

Figures reproduced from arXiv: 2604.06276 by Xiaoyi Chen, Xin Zhang.

Figure 1: Timeline of StEM2. In the content design phase, the film introduced various scenes with high contrast and extreme lighting conditions. For example, the opening cave segment constructs a high-contrast environment with large dark areas and bright point light sources coexisting via LED virtual production; the car-interior driving segment simulates extreme contrast ratios using heavily overexposed backgrounds;…
Figure 2: Global mapping relationship between SDR and HDR masters in the luminance domain.
Figure 3: SDR-HDR mapping characteristics of typical scenes.
Figure 4: Residual clustering in the energy-structure space and physical attribution.
Figure 5: Comparison of SDR and HDR color distributions in ICtCp space.
Figure 6: Pixel-level decision maps in representative scenes. Green indicates EXR-closer recovery under the…
original abstract

We present an empirical case study of cinema SDR-to-HDR mapping using ASC StEM2, a rare common-source dataset containing EXR scene-referred images and matched SDR/HDR cinema release masters from the same ACES-based mastering workflow. Based on pixel-wise statistics over all 18,580 frames of the test film, we construct a three-domain comparison involving EXR source data, SDR release masters, and HDR release masters to characterize their luminance and color structural relationships within this controlled workflow. In the luminance dimension, SDR and HDR masters exhibit a highly stable global monotonic correspondence, with geometric structure remaining largely consistent overall; sparse and structured deviations appear in self-luminous highlights and specific material regions. In the color dimension, the two masters remain largely consistent in hue, with saturation exhibiting a redistribution pattern of shadow suppression, midtone expansion, and highlight convergence. Using EXR as a scene-referred anchor, we further define a pixel-level decision map that operationally separates EXR-closer recovery regions from content-adaptive adjustment regions. Under this operational definition, 82.4% of sampled image regions are classified as EXR-closer recovery, while the remainder require localized adaptive adjustment. Rather than claiming a universal law for all cinema mastering pipelines, the study provides an interpretable quantitative baseline for structure-aware SDR-to-HDR analysis and for designing learning-based models under shared-source mastering conditions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. This paper conducts a pixel-wise empirical case study on the structural relationships between SDR and HDR cinema masters using the ASC StEM2 dataset, which provides matched EXR scene-referred images and release masters from the same ACES-based workflow across 18,580 frames. Key findings include a highly stable global monotonic correspondence in luminance between SDR and HDR, consistent hue with shadow suppression, midtone expansion, and highlight convergence in saturation, and a pixel-level decision map classifying 82.4% of regions as EXR-closer recovery under the study's operational definition.

Significance. The study offers a useful quantitative baseline for structure-aware SDR-to-HDR analysis in controlled mastering workflows. Strengths include the direct pixel-wise computation over a large number of frames and the use of a common-source dataset, which supports reproducible observations of monotonicity and color patterns. These can inform the design of learning-based models for similar pipelines, though the work positions itself as a case study rather than a universal claim.

major comments (1)
  1. §4.2 (Decision Map and Classification Results): The headline 82.4% EXR-closer classification is produced by an operational pixel-level decision map whose thresholds, normalization, and distance metric are defined within the study. No sensitivity analysis is reported to show how this percentage changes under reasonable alternative definitions of the map (e.g., different color spaces or cutoff values), even though the underlying monotonic luminance correspondence is directly computed and less sensitive to these choices.
minor comments (2)
  1. Abstract: The exact color space, distance metric, and threshold rules for the pixel decision map are not stated; including them would allow readers to evaluate the classification claim without consulting the main text.
  2. §3 (Methods): The description of how the decision map is computed from per-pixel comparisons could be augmented with a concise equation or pseudocode to improve reproducibility of the 82.4% figure.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of the case study and the constructive suggestion regarding the decision map. We address the comment below and will incorporate additional analysis in the revised manuscript.

point-by-point responses
  1. Referee: §4.2 (Decision Map and Classification Results): The headline 82.4% EXR-closer classification is produced by an operational pixel-level decision map whose thresholds, normalization, and distance metric are defined within the study. No sensitivity analysis is reported to show how this percentage changes under reasonable alternative definitions of the map (e.g., different color spaces or cutoff values), even though the underlying monotonic luminance correspondence is directly computed and less sensitive to these choices.

    Authors: We agree that reporting sensitivity to the operational choices would strengthen the presentation of the 82.4% figure. The decision map is explicitly defined within the study as an operational tool to separate EXR-closer recovery regions from those requiring content-adaptive adjustment, using the three-domain (EXR-SDR-HDR) pixel statistics. In the revision we will add a sensitivity analysis that varies the distance metric (e.g., Euclidean in linear vs. perceptual spaces such as CIELAB or ACES), normalization constants, and cutoff thresholds around the reported values. We will show that the classification percentage remains stable within a few percentage points under these alternatives, while confirming that the core monotonic luminance correspondence and hue-consistency observations are computed directly from the data and do not depend on the map parameters. revision: yes
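
A sensitivity analysis of the kind discussed in this exchange could take the following shape. This is a hedged sketch only: the two distance metrics (absolute difference in the linear domain and in a log2 domain), the `margin` cutoffs, and the closer-than-SDR rule are all stand-ins chosen for illustration, since the paper's actual definitions are not given in this summary.

```python
import math

EPS = 1e-6  # guards log2 against zero-valued pixels (assumed convention)

# Hypothetical distance metrics; the paper's metric is not stated here.
METRICS = {
    "linear": lambda a, b: abs(a - b),
    "log2": lambda a, b: abs(math.log2(a + EPS) - math.log2(b + EPS)),
}


def closer_fraction(exr, sdr, hdr, dist, margin):
    """Share of pixels whose HDR value beats the SDR value by at least margin."""
    if not exr:
        return 0.0
    hits = sum(
        1 for e, s, h in zip(exr, sdr, hdr) if dist(h, e) + margin < dist(s, e)
    )
    return hits / len(exr)


def sensitivity_table(exr, sdr, hdr, margins):
    """Return {(metric name, margin): EXR-closer fraction} for every combination."""
    return {
        (name, margin): closer_fraction(exr, sdr, hdr, dist, margin)
        for name, dist in METRICS.items()
        for margin in margins
    }
```

Reporting the spread of this table across metrics and margins would show how strongly a single headline percentage depends on the operational choices.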

Circularity Check

0 steps flagged

No circularity: all claims are direct empirical counts under an explicitly defined operational map

full rationale

The paper is a descriptive case study that computes pixel-wise statistics across the full ASC StEM2 dataset, observes monotonic luminance and hue/saturation patterns, and then applies a self-stated operational definition of an EXR-closer decision map to produce the 82.4% classification count. No equations, fitted parameters, or predictions are present that reduce back to the same inputs by construction. The percentage is simply the fraction of pixels satisfying the authors' own distance-based rule; it is reported as such without any claim of independence from the definition. No self-citations, uniqueness theorems, or ansatzes appear in the provided text. This is a standard empirical reporting structure with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claims rest on empirical pixel statistics from one controlled workflow; the only notable free parameter is the operational threshold set for the decision map, and the main assumption is that the ACES masters are correctly matched to the EXR source.

free parameters (1)
  • decision map thresholds
    The criteria separating EXR-closer recovery regions from content-adaptive adjustment regions are defined operationally and likely involve chosen numerical cutoffs.
axioms (1)
  • domain assumption: The provided SDR and HDR masters come from the same ACES-based mastering workflow as the EXR source
    Invoked when treating the three versions as directly comparable for structural analysis.

