Helmlab: A Two-Space Family of Analytical, Data-Driven Color Spaces for UI Design Systems
Pith reviewed 2026-05-15 19:21 UTC · model grok-4.3
The pith
MetricSpace, a 72-parameter analytical color space, cuts color-difference error by 23 percent versus CIEDE2000 on UI-relevant pairs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Helmlab supplies two purpose-built color spaces that share the same analytical forward transform from CIE XYZ: MetricSpace (72 parameters) optimized for color-difference prediction and GenSpace (44 parameters) optimized for generation tasks. On the COMBVD benchmark MetricSpace reaches STRESS 22.48 against CIEDE2000’s 29.20; averaged across three primary datasets it scores 21.75 versus the next-best baseline at 35.98. The pipeline includes per-channel compression, Fourier hue correction, embedded lightness adjustment, neutral-axis correction, and an isometry that preserves distances while aligning hue angles. Both spaces are exactly invertible with round-trip error below 1e-13.
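The STRESS figures quoted above can be reproduced from raw data with a few lines of code. The sketch below uses the standard definition of the STRESS index (a scale factor F is fitted first, then the normalized residual is reported in percent). The function name and the toy numbers are illustrative and do not come from the paper or the helmlab package.

```python
import math

def stress(predicted, visual):
    """STRESS index in percent: 0 means perfect agreement, larger is worse."""
    if len(predicted) != len(visual) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    # Scale factor F that best aligns visual differences with predictions.
    sum_pp = sum(p * p for p in predicted)
    sum_pv = sum(p * v for p, v in zip(predicted, visual))
    f = sum_pp / sum_pv
    # Normalized residual after scaling.
    num = sum((p - f * v) ** 2 for p, v in zip(predicted, visual))
    den = sum((f * v) ** 2 for v in visual)
    return 100.0 * math.sqrt(num / den)

# Toy data (made up): predicted differences vs. visual judgements.
toy_predicted = [1.0, 2.0, 3.0]
toy_visual = [1.0, 2.0, 4.0]
print(f"toy STRESS: {stress(toy_predicted, toy_visual):.2f}")
```

Because F is fitted per comparison, STRESS does not depend on the overall scale of the visual data, which is why scores from different formulas and datasets can be compared directly.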
What carries the argument
An 11-stage analytical transform from CIE XYZ that chains learned linear matrices, per-channel power compression, Fourier-series hue correction, Helmholtz-Kohlrausch lightness adjustment, neutral-axis correction, and a rigid chromatic-plane rotation.
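To illustrate why a stage such as per-channel power compression does not threaten invertibility, here is a minimal sketch of a sign-preserving power law and its exact inverse. The exponents are the γ values quoted for MetricSpace v21 further down this page; treating them as the compression exponents, and the function names `compress` and `expand`, are assumptions of this sketch rather than the paper's confirmed implementation.

```python
import math

# Assumption: the quoted gamma values act as per-channel compression exponents.
GAMMA = (0.472, 0.515, 0.511)

def compress(channels, gamma=GAMMA):
    """Sign-preserving power compression, one exponent per channel."""
    return tuple(math.copysign(abs(x) ** g, x) for x, g in zip(channels, gamma))

def expand(channels, gamma=GAMMA):
    """Inverse of compress: raise each magnitude to 1/gamma."""
    return tuple(math.copysign(abs(y) ** (1.0 / g), y) for y, g in zip(channels, gamma))

sample = (0.4, 0.2, -0.05)  # made-up channel values
round_trip = expand(compress(sample))
max_error = max(abs(a - b) for a, b in zip(sample, round_trip))
print(f"max round-trip error: {max_error:.3e}")
```

Each stage of this kind is a bijection on its domain, so chaining them keeps the whole transform invertible; the remaining error is floating-point rounding.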
If this is right
- Design tools can replace CIEDE2000 with MetricSpace for more accurate contrast and harmony checks without changing existing code paths.
- Palette and gradient generators can switch to GenSpace to improve smoothness and uniformity across wide-gamut spaces.
- The shared invertible pipeline allows round-tripping among the three representations (CIE XYZ, MetricSpace, and GenSpace) with negligible error.
- A single rotation of the chromatic plane can be applied to align hue angles for specific brand palettes while leaving the distance metric unchanged.
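The last point rests on a simple fact: a rigid rotation of the (a, b) plane changes hue angles but leaves Euclidean distances untouched. The sketch below checks this numerically on made-up Lab-like triples; `rotate_ab` and the 25-degree angle are illustrative names and values, not taken from the paper.

```python
import math

def rotate_ab(lab, theta_deg):
    """Rotate the chromatic (a, b) plane by theta_deg; lightness is unchanged."""
    lightness, a, b = lab
    t = math.radians(theta_deg)
    return (
        lightness,
        a * math.cos(t) - b * math.sin(t),
        a * math.sin(t) + b * math.cos(t),
    )

def hue_deg(lab):
    """Hue angle of a Lab-like triple in degrees, in [0, 360)."""
    return math.degrees(math.atan2(lab[2], lab[1])) % 360.0

p = (0.60, 0.10, 0.05)   # made-up coordinates
q = (0.55, -0.02, 0.12)

before = math.dist(p, q)
after = math.dist(rotate_ab(p, 25.0), rotate_ab(q, 25.0))
print(f"distance before: {before:.12f}, after: {after:.12f}")
print(f"hue of p before: {hue_deg(p):.2f}, after: {hue_deg(rotate_ab(p, 25.0)):.2f}")
```

The same argument holds for any distance that depends on Δa and Δb only through Δa² + Δb², which is the form of the metric quoted later on this page.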
Where Pith is reading between the lines
- Because the pipeline is fully analytical, new parameter sets could be derived for other domains such as medical imaging or material appearance without retraining the entire structure.
- The neutral-axis correction and Fourier hue term may generalize to other perceptually motivated transforms that currently rely on ad-hoc fixes.
- Exact invertibility opens the possibility of using these spaces as intermediate representations inside color-managed rendering pipelines.
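A Fourier-series hue term can stay invertible as long as the correction is small enough that the corrected hue remains a strictly increasing function of the input hue. The sketch below shows one way to invert such a term by fixed-point iteration. The coefficients are hypothetical, and the paper may well use a different inversion method; this only illustrates that an inverse exists and can be computed to floating-point accuracy.

```python
import math

# Hypothetical (cosine, sine) coefficients for harmonics k = 1 and k = 2.
COEFFS = ((0.05, -0.03), (0.02, 0.01))

def hue_offset(h, coeffs=COEFFS):
    """Truncated Fourier series evaluated at hue h (radians)."""
    return sum(
        a * math.cos(k * h) + b * math.sin(k * h)
        for k, (a, b) in enumerate(coeffs, start=1)
    )

def correct_hue(h):
    """Forward hue correction: h' = h + offset(h)."""
    return h + hue_offset(h)

def invert_hue(h_corrected, iterations=60):
    """Solve h + offset(h) = h_corrected by fixed-point iteration.

    Converges because the offset's slope is well below 1 in magnitude
    for the coefficients above, so the update is a contraction.
    """
    h = h_corrected
    for _ in range(iterations):
        h = h_corrected - hue_offset(h)
    return h

for hue in (0.0, 1.0, 2.5, 4.0, 6.0):
    recovered = invert_hue(correct_hue(hue))
    print(f"h = {hue:.2f}  round-trip error = {abs(recovered - hue):.2e}")
```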
Load-bearing premise
The optimized parameters will continue to perform well on new UI color pairs and screen conditions that were not seen during tuning.
What would settle it
Collect a fresh set of several thousand color-pair judgments under typical UI viewing conditions and measure whether MetricSpace STRESS remains below 24 while CIEDE2000 stays above 29.
Original abstract
We present Helmlab, a family of two purpose-built color spaces for UI design systems sharing a common 11-stage analytical structure: MetricSpace, a 72-parameter space optimized for color-difference prediction, and GenSpace, a 44-parameter space optimized for gradient and palette generation. The forward transform maps CIE XYZ to a perceptually-organized Lab representation through learned matrices, per-channel power compression, Fourier hue correction, and embedded Helmholtz-Kohlrausch lightness adjustment. A post-pipeline neutral correction holds gray-axis chroma below 1e-5 on a 21-step ramp, and a rigid rotation of the chromatic plane improves hue-angle alignment without affecting the distance metric (which is invariant under isometries). On COMBVD (3,813 color pairs), MetricSpace v21 achieves STRESS 22.48, a 23 percent reduction from CIEDE2000 (29.20). On the held-out MacAdam 1974 dataset it scores 19.51 (CIEDE2000: 22.13; CAM16-UCS leads at 18.71). On a self-collected 3,552-judgement screen-condition set it scores 23.26 vs 62.54 for CIEDE2000. On academic He et al. 2022 (82 3D-printed pairs) MetricSpace scores 35.9 vs CIEDE2000 32.6, a regression we own. Averaging the three primary datasets, MetricSpace scores 21.75 vs the next-best baseline CIECAM02-UCS at 35.98. GenSpace v0.11.1 trades distance accuracy for generation quality: on a 90-metric, 3,038-pair gradient/palette benchmark across sRGB, P3, and Rec.2020, it wins 65 of 90 vs OKLab. The transform is invertible with round-trip errors below 1e-13. Production implementations ship on PyPI, npm, Color.js (PR 722, merged), and as a PostCSS plugin.
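The headline percentages in the abstract follow directly from the quoted STRESS scores, and a short check makes the arithmetic explicit. One assumption is flagged here: the abstract does not name the "three primary datasets" at this point, but COMBVD, MacAdam 1974, and the screen-condition set are the three whose scores average to the reported 21.75.

```python
# (MetricSpace STRESS, CIEDE2000 STRESS) as quoted in the abstract.
scores = {
    "COMBVD": (22.48, 29.20),
    "MacAdam 1974": (19.51, 22.13),
    "screen set": (23.26, 62.54),
}

def relative_reduction(new, baseline):
    """Fractional reduction of `new` relative to `baseline` (negative = regression)."""
    return (baseline - new) / baseline

for name, (metricspace, ciede2000) in scores.items():
    pct = 100.0 * relative_reduction(metricspace, ciede2000)
    print(f"{name}: {pct:.1f}% lower STRESS than CIEDE2000")
# COMBVD: 23.0%, MacAdam 1974: 11.8%, screen set: 62.8%

mean_metricspace = sum(m for m, _ in scores.values()) / len(scores)
print(f"three-dataset mean: {mean_metricspace:.2f}")  # 21.75

# He et al. 2022 is reported separately and is a regression.
he_pct = 100.0 * relative_reduction(35.9, 32.6)
print(f"He et al. 2022: {he_pct:.1f}%")  # -10.1
```

The check shows that the average is dominated by the self-collected screen set, where the gap to CIEDE2000 is by far the largest.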
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Helmlab, a two-space family of analytical color spaces for UI design: MetricSpace (72 parameters) optimized for color-difference prediction via an 11-stage forward transform from CIE XYZ using learned matrices, per-channel powers, Fourier hue terms, and embedded Helmholtz-Kohlrausch adjustment; and GenSpace (44 parameters) optimized for gradient/palette generation. It reports STRESS of 22.48 on COMBVD (23% below CIEDE2000), 19.51 on held-out MacAdam 1974, 23.26 on a self-collected screen set, but 35.9 (worse than CIEDE2000's 32.6) on He et al. 2022; GenSpace wins 65/90 metrics on a 3,038-pair generation benchmark. The transforms are invertible to 1e-13 error with production implementations provided.
Significance. If the gains hold out-of-sample, the work offers practical, invertible color spaces that combine analytical structure with data-driven tuning for UI workflows, supported by shipped code in PyPI, npm, Color.js, and PostCSS. The explicit handling of neutral-axis correction and isometry-invariant distances is a constructive contribution to perceptually organized spaces.
major comments (3)
- [Abstract, optimization section] Abstract and optimization section: MetricSpace's 72 parameters (learned matrices, per-channel powers, Fourier hue terms, HK adjustment) are optimized directly on COMBVD (3,813 pairs), the same dataset used to claim the primary 23% STRESS reduction (22.48 vs CIEDE2000 29.20); this in-sample fitting makes the reported superiority dependent on the training distribution rather than demonstrated generalization.
- [Results section] Results on held-out and secondary sets: On MacAdam 1974 the gain shrinks to ~12% (19.51 vs 22.13) with CAM16-UCS better at 18.71; on He et al. 2022 MetricSpace regresses to 35.9 vs CIEDE2000 32.6. These outcomes indicate the fitting does not reliably outperform established models on unseen data.
- [Methods / optimization section] Parameter fitting protocol: No cross-validation, regularization, or sensitivity analysis is described for the 72/44 parameters; without these, the claim that the analytical structure plus fitting yields robust UI-specific spaces cannot be evaluated.
minor comments (2)
- [Transform description] The 11-stage pipeline description would benefit from an explicit equation or diagram numbering each stage (matrix, power, Fourier, HK, neutral correction, rotation) to clarify data flow.
- [Generation results] Table or figure presenting the 90-metric generation benchmark should list the exact metrics and per-dataset breakdowns rather than aggregate win count.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below, acknowledging the in-sample nature of the primary optimization while pointing to the held-out evaluations already present in the manuscript. We commit to revisions that clarify these points and strengthen the methods description.
Point-by-point responses
- Referee: [Abstract, optimization section] MetricSpace's 72 parameters (learned matrices, per-channel powers, Fourier hue terms, HK adjustment) are optimized directly on COMBVD (3,813 pairs), the same dataset used to claim the primary 23% STRESS reduction (22.48 vs CIEDE2000 29.20); this in-sample fitting makes the reported superiority dependent on the training distribution rather than demonstrated generalization.
  Authors: We agree that the primary STRESS reduction of 23% is reported on the COMBVD dataset used for optimization. The manuscript already reports performance on the held-out MacAdam 1974 set (19.51 vs CIEDE2000 22.13) and the self-collected screen set (23.26 vs 62.54), which provide evidence of generalization to other conditions. The regression on He et al. 2022 is explicitly noted in the paper. In revision we will add explicit language in the abstract and optimization section distinguishing the training set from the held-out evaluations and discuss the implications of in-sample optimization for UI-tuned spaces.
  Revision: partial
- Referee: [Results section] Results on held-out and secondary sets: On MacAdam 1974 the gain shrinks to ~12% (19.51 vs 22.13) with CAM16-UCS better at 18.71; on He et al. 2022 MetricSpace regresses to 35.9 vs CIEDE2000 32.6. These outcomes indicate the fitting does not reliably outperform established models on unseen data.
  Authors: The manuscript already presents these exact results transparently, including the regression on He et al. 2022 which we own. On MacAdam 1974 we still improve over CIEDE2000, and the large gain on the screen-condition set supports the UI-specific tuning. We will expand the results discussion to explain dataset differences (e.g., 3D-printed vs. screen stimuli and viewing conditions) and why MetricSpace prioritizes screen/UI performance over uniform outperformance on all academic sets.
  Revision: partial
- Referee: [Methods / optimization section] Parameter fitting protocol: No cross-validation, regularization, or sensitivity analysis is described for the 72/44 parameters; without these, the claim that the analytical structure plus fitting yields robust UI-specific spaces cannot be evaluated.
  Authors: We accept this criticism. The revised manuscript will include a dedicated subsection on the fitting protocol, detailing the optimization procedure, any regularization applied, and a sensitivity analysis on key parameters (e.g., matrix entries and power exponents). We will also report cross-validation results on COMBVD splits to quantify robustness.
  Revision: yes
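The cross-validation the authors commit to can be sketched in a few lines: split the colour pairs into k folds, fit on k-1 folds, and report STRESS on the fold left out. Everything below is a stand-in under stated assumptions: the data are synthetic, the "model" is a single power-law exponent chosen by grid search rather than the 72-parameter transform, and none of the function names come from the helmlab package.

```python
import math
import random

def stress(predicted, visual):
    """STRESS index in percent (standard definition)."""
    f = sum(p * p for p in predicted) / sum(p * v for p, v in zip(predicted, visual))
    num = sum((p - f * v) ** 2 for p, v in zip(predicted, visual))
    den = sum((f * v) ** 2 for v in visual)
    return 100.0 * math.sqrt(num / den)

def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_exponent(raw, visual, grid):
    """Toy one-parameter fit: pick the exponent with the lowest training STRESS."""
    return min(grid, key=lambda p: stress([r ** p for r in raw], visual))

def cross_validate(raw, visual, k=5):
    """Return the held-out STRESS of each fold."""
    grid = [0.5 + 0.05 * i for i in range(21)]  # candidate exponents 0.5 .. 1.5
    held_out_scores = []
    for fold in kfold_indices(len(raw), k):
        held = set(fold)
        train = [i for i in range(len(raw)) if i not in held]
        p = fit_exponent([raw[i] for i in train], [visual[i] for i in train], grid)
        held_out_scores.append(
            stress([raw[i] ** p for i in fold], [visual[i] for i in fold])
        )
    return held_out_scores

# Synthetic stand-in for a colour-difference dataset (not real judgements).
rng = random.Random(1)
raw = [rng.uniform(0.5, 10.0) for _ in range(200)]
visual = [r ** 0.8 * rng.uniform(0.9, 1.1) for r in raw]

fold_scores = cross_validate(raw, visual)
print("held-out STRESS per fold:", [round(s, 2) for s in fold_scores])
```

Comparing the spread of held-out scores with the in-sample score is the quantity the referee asks for; a large gap would indicate overfitting.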
Circularity Check
72-parameter optimization on COMBVD and related benchmarks ties primary STRESS claims to in-sample fitting
specific steps
- fitted input called prediction [Abstract, paragraph 2]
  "a 72-parameter space optimized for color-difference prediction... On COMBVD (3,813 color pairs), MetricSpace v21 achieves STRESS 22.48, a 23 percent reduction from CIEDE2000 (29.20)."
  The optimization target is color-difference prediction; the primary reported metric is STRESS on COMBVD, which participates in that optimization. The reduction is therefore a measure of how well the fitted parameters reproduce the fitting data rather than an independent prediction on unseen conditions.
- fitted input called prediction [Abstract, paragraph 3]
  "GenSpace v0.11.1 trades distance accuracy for generation quality: on a 90-metric, 3,038-pair gradient/palette benchmark across sRGB, P3, and Rec.2020, it wins 65 of 90 vs OKLab."
  The 44 parameters are optimized for gradient and palette generation; the win count is reported on the identical benchmark class, rendering the superiority a direct consequence of the fitting objective.
full rationale
The paper explicitly optimizes MetricSpace's 72 parameters (learned matrices, per-channel powers, Fourier terms, HK adjustment) for color-difference prediction and reports the headline 23% STRESS reduction on COMBVD, the same class of data used in fitting. While MacAdam 1974 is labeled held-out and shows smaller gains, the central result and three-dataset average remain dependent on the fitting target without referenced cross-validation or regularization. GenSpace's 44 parameters follow the same pattern on its gradient/palette benchmark. This matches the fitted_input_called_prediction pattern but does not collapse the entire derivation to a tautology, as the analytical structure (11-stage transform, invertibility) retains independent content.
Axiom & Free-Parameter Ledger
free parameters (2)
- 72 parameters in MetricSpace
- 44 parameters in GenSpace
axioms (2)
- standard math: CIE XYZ tristimulus values serve as the input representation
- domain assumption: The sequence of matrix, power, Fourier, and rotation steps produces a perceptually organized space
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  MetricSpace v21: a 72-parameter analytical color space... learned matrices, per-channel power compression, Fourier hue correction, and embedded Helmholtz–Kohlrausch lightness adjustment
- IndisputableMonolith/Foundation/BranchSelection.lean: branch_selection (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  The v21-optimized values are γ = [0.472, 0.515, 0.511]... The distance metric... d = [(ΔL/S_L)² + w_C(Δa² + Δb²)/S_C²]^(p/2) followed by [d/(1 + c·d)]^q
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: reality_from_one_distinction (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  On COMBVD (3,813 color pairs), MetricSpace v21 achieves STRESS 22.48, a 23 percent reduction from CIEDE2000 (29.20)
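The distance formula quoted in the second entry above can be written out as a small function. This is a sketch under explicit assumptions: the quoted passage is elided, so S_L and S_C are treated here as plain constants, and every default parameter value is a placeholder chosen so that the function reduces to Euclidean distance. None of the defaults are the fitted v21 values, and the function name is hypothetical.

```python
import math

def helmlab_style_distance(lab1, lab2, s_l=1.0, s_c=1.0, w_c=1.0,
                           p=1.0, c=0.0, q=1.0):
    """d = [(dL/S_L)^2 + w_C (da^2 + db^2) / S_C^2]^(p/2), then [d / (1 + c d)]^q.

    All defaults are placeholders (assumption), not the paper's parameters.
    """
    d_l = lab1[0] - lab2[0]
    d_a = lab1[1] - lab2[1]
    d_b = lab1[2] - lab2[2]
    base = (d_l / s_l) ** 2 + w_c * (d_a * d_a + d_b * d_b) / s_c ** 2
    d = base ** (p / 2.0)
    # Compressive post-step: saturates towards (1/c)^q for large d when c > 0.
    return (d / (1.0 + c * d)) ** q

x = (0.60, 0.10, 0.05)   # made-up coordinates
y = (0.55, -0.02, 0.12)

plain = helmlab_style_distance(x, y)
compressed = helmlab_style_distance(x, y, c=0.5)
print(f"placeholder distance: {plain:.6f}")
print(f"with c = 0.5:         {compressed:.6f}")
```

Because Δa and Δb enter only through Δa² + Δb², any rotation of the chromatic plane leaves this distance unchanged, which is the isometry property the abstract relies on.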
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] CIE, "Colorimetry," CIE Publication 15, Vienna, 1976. Latest revision: CIE 015:2018, ISBN 978-3-902842-13-8. https://cie.co.at/publications/colorimetry-4th-edition
- [2] M. R. Luo, G. Cui, and B. Rigg, "The development of the CIE 2000 colour-difference formula: CIEDE2000," Color Res. Appl., vol. 26, no. 5, pp. 340–350, 2001. doi:10.1002/col.1049
- [3] B. Ottosson, "A perceptual color space for image processing," blog post, December 2020. https://bottosson.github.io/posts/oklab/. Adopted in CSS Color Module 4 (W3C Candidate Recommendation Draft, 2026)
- [4] C. Li, Z. Li, Z. Wang, et al., "Comprehensive color solutions: CAM16, CAT16, and CAM16-UCS," Color Res. Appl., vol. 42, no. 6, pp. 703–718, 2017. doi:10.1002/col.22131
- [5] F. Ebner and M. D. Fairchild, "Development and testing of a color space (IPT) with improved hue uniformity," in Proc. IS&T 6th Color Imaging Conference (CIC), Scottsdale, AZ, 1998, pp. 8–13
- [6] M. Safdar, G. Cui, Y. J. Kim, and M. R. Luo, "Perceptually uniform color space for image signals including high dynamic range and wide gamut," Opt. Express, vol. 25, no. 13, pp. 15131–15151, 2017. doi:10.1364/OE.25.015131
- [7] M. R. Luo, G. Cui, and C. Li, "Uniform colour spaces based on CIECAM02 colour appearance model," Color Res. Appl., vol. 31, no. 4, pp. 320–330, 2006. doi:10.1002/col.20227
- [8] R. He, G. Cui, T. Zhu, and M. R. Luo, "Colour difference evaluation using large colour differences," in Proc. 30th CIE Session, Ljubljana, 2022
- [9] D. L. MacAdam, "Uniform color scales," J. Opt. Soc. Am., vol. 64, no. 12, pp. 1691–1702, 1974. doi:10.1364/JOSA.64.001691
- [10] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, "A limited memory algorithm for bound constrained optimization," SIAM J. Sci. Comput., vol. 16, no. 5, pp. 1190–1208, 1995. doi:10.1137/0916069
- [11] Y. Gao, M. R. Luo, M. R. Pointer, and C. Li, "Evaluation of color difference prediction with CIECAM16 using CIE 2- and 10-degree observers," J. Imaging Sci. Technol., vol. 67, no. 2, art. no. 020401, 2023
- [12] L. Verou and C. Lilley, Color.js: A library for color conversions, manipulation, and difference computation, version 0.5.x, 2026. Project page: https://colorjs.io/; source repository: https://github.com/color-js/color.js; HELMLAB integration merged in PR #722, https://github.com/color-js/color.js/pull/722, 2026-05-04
- [13] G. Yıldız, helmlab: Data-driven analytical color space for perceptual color difference, software package version 0.12.2, 2026. PyPI: https://pypi.org/project/helmlab/; npm: https://www.npmjs.com/package/helmlab; source: https://github.com/Grkmyldz148/helmlab
- [14] G. Yıldız, postcss-helmlab: PostCSS plugin for HELMLAB CSS color functions, software package version 0.1.x, 2026. https://www.npmjs.com/package/postcss-helmlab; source: https://github.com/Grkmyldz148/helmlab/tree/main/packages/postcss-helmlab
- [15] T. Mansencal et al., Colour: Science software for the colour processing community, version 0.4.x, 2026. Project page: https://www.colour-science.org/; source repository: https://github.com/colour-science/colour; reference implementations of CAM16-UCS, CIECAM02-UCS, Jzazbz, DIN99 and IPT delta-E used as canonical baselines in this paper
- [16] B. Ottosson, "OkLCh gamut clipping," technical note, May 2021. https://bottosson.github.io/posts/gamutclipping/; also see https://bottosson.github.io/posts/colorpicker/ (November 2021) and the public discussion thread at https://github.com/color-js/color.js/issues/81 (2022–2024).
A Parameter Table (MetricSpace v21). Table 6 lists all 72 trained MetricSpace ...