Review of Measures Used for Evaluating Color Difference Models

Patrick De Visschere

arxiv: 2601.13402 · v3 · pith:DTTXTN53new · submitted 2026-01-19 · ⚛️ physics.optics

Pith reviewed 2026-05-21 15:53 UTC · model grok-4.3

keywords color difference models · line elements · STRESS · coordinate independence · ellipsoid integration · gamma-1 measure · d_ev error measure

The pith

After affine transformation to the unit ball and full integration over ellipsoids, only the STRESS measure for color differences proves coordinate independent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews several measures for judging how well theoretical color difference models match experimental data, including V_AB, gamma minus one, CV, and STRESS. To avoid errors from arbitrary direction sampling, it integrates each measure over an entire ellipsoid after applying an affine transformation that turns the theoretical ellipsoid into a unit ball. In the limit of small deviations from circularity all measures agree, but they diverge for larger deviations, with gamma minus one being most sensitive and STRESS least. The analysis shows that only STRESS remains unchanged under coordinate transformations, while the gamma minus one measures uniquely allow a straightforward derivation of the best global difference measure from local definitions. This matters for developing accurate color models because biased measures could lead to inconsistent evaluations depending on the chosen color space coordinates.

Core claim

This review demonstrates that when difference measures are integrated over complete ellipsoids following an affine transformation that normalizes the theoretical line element to the unit ball, only the STRESS measure is independent of the coordinate system chosen. Furthermore, the gamma-1 measures are the only ones that permit deriving the globally optimized difference measure in a simple way from the locally defined ones, and the previously proposed d_ev is shown to be the eigenvalue version of gamma-1. Other measures like the correlation coefficient r are shown to be coordinate dependent, while Pant's 1-R is independent.

What carries the argument

The affine transformation of the theoretical line element to the unit ball combined with integration of the difference measure over the full ellipsoid or ellipse, which eliminates directional sampling artifacts and enables assessment of coordinate independence.
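The transformation step can be sketched numerically in two dimensions. This is a minimal illustration, not the paper's code: it assumes both ellipses are given as symmetric positive-definite matrices (`G` theoretical, `A` experimental, both hypothetical values), and it uses a Cholesky factor as one possible choice of affine map. The function names are illustrative.

```python
import numpy as np

def whiten(G, A):
    """Map the theoretical ellipse x^T G x = 1 onto the unit circle and
    return the matrix of the experimental ellipse x^T A x = 1 in the new
    coordinates y = L^T x, where G = L L^T (Cholesky factorization)."""
    L = np.linalg.cholesky(G)      # lower triangular, G = L @ L.T
    L_inv = np.linalg.inv(L)
    return L_inv @ A @ L_inv.T

def whitened_eigenvalues(G, A):
    """Eigenvalues of the whitened experimental matrix. They all equal 1
    exactly when the experimental ellipse coincides with the theoretical one."""
    return np.sort(np.linalg.eigvalsh(whiten(G, A)))

# Hypothetical example matrices (assumptions, not data from the paper).
G = np.array([[4.0, 1.0], [1.0, 2.0]])   # theoretical line element
A = np.array([[3.0, 0.5], [0.5, 2.5]])   # experimental ellipse
M = np.array([[1.0, 2.0], [0.0, 3.0]])   # arbitrary invertible change of coordinates x = M x'

ev = whitened_eigenvalues(G, A)
ev_new = whitened_eigenvalues(M.T @ G @ M, M.T @ A @ M)
print(ev, ev_new)
```

Because the whitened matrix is similar to G⁻¹A, its eigenvalues do not change when both ellipses are re-expressed in other coordinates, which is why comparing them with one gives a coordinate-independent check.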

If this is right

All measures become equivalent when deviations from circularity are small.
Gamma-1 is the most sensitive to larger deviations while STRESS is the least sensitive.
The correlation coefficient r is very coordinate dependent and was appropriately abandoned.
Pant's geometric measure 1-R is coordinate independent like STRESS.
A scaling parameter can be optimized to make all measures scale invariant.
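The role of the scaling parameter can be illustrated with STRESS. The sketch below assumes the commonly used definition STRESS = 100·sqrt(Σ(ΔE_i − F·ΔV_i)² / Σ F²·ΔV_i²) with the optimized factor F = ΣΔE_i² / ΣΔE_i·ΔV_i; the paper's own normalization may differ, and the function name and sample values are purely illustrative.

```python
import numpy as np

def stress(dE, dV):
    """STRESS (in percent) between two sets of positive differences,
    using the optimized scaling factor F (assumed standard definition)."""
    dE = np.asarray(dE, dtype=float)
    dV = np.asarray(dV, dtype=float)
    F = np.sum(dE**2) / np.sum(dE * dV)    # optimal scale factor
    return 100.0 * np.sqrt(np.sum((dE - F * dV)**2) / np.sum((F * dV)**2))

# Hypothetical sample data (not taken from the paper).
dE = np.array([1.0, 2.0, 3.0, 4.0])
dV = np.array([1.2, 1.9, 3.3, 3.8])

s = stress(dE, dV)
s_scaled = stress(dE, 2.5 * dV)   # rescaling one data set leaves STRESS unchanged
print(s, s_scaled)
```

With this choice of F the expression reduces to 100·sqrt(1 − (ΣΔE·ΔV)² / (ΣΔE²·ΣΔV²)), so the value depends only on the angle between the two data vectors and is unaffected by an overall scale factor on either set.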

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Applying this integration method to other perceptual spaces could improve evaluations of sensory models in general.
Preferring coordinate-independent measures like STRESS might lead to more robust color difference formulas across different lighting conditions.
The simple derivation property of gamma-1 could be used to develop hybrid local-global optimization techniques in color science.

Load-bearing premise

That performing an affine transformation to make the theoretical ellipsoid into the unit ball and then integrating the measure over the full shape gives a fair comparison that does not introduce its own biases or artifacts.

What would settle it

A calculation showing that STRESS values change when the coordinate system is rotated or scaled after the affine transformation, or that gamma-1 does not produce the expected global optimum when derived from local measures on a specific data set.
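A numerical version of the first check could look like the sketch below. It rests on several assumptions that are not taken from the paper: the ellipses are 2D and given as symmetric positive-definite matrices, the affine map is a Cholesky whitening, the integral is approximated by dense uniform sampling of directions on the unit circle, the experimental difference along a unit direction u is taken as sqrt(uᵀBu), and STRESS is evaluated in its closed form with optimized scale factor. All names and values are hypothetical.

```python
import numpy as np

def whitened_matrix(G, A):
    """Experimental matrix after mapping the theoretical ellipse to the unit circle."""
    L_inv = np.linalg.inv(np.linalg.cholesky(G))
    return L_inv @ A @ L_inv.T

def continuous_stress(B, n=3600):
    """Approximate STRESS integrated over the full unit circle.
    Theoretical differences are 1 in every direction; experimental ones
    are sqrt(u^T B u) (an assumption of this sketch)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    u = np.stack([np.cos(theta), np.sin(theta)])           # shape (2, n)
    d_exp = np.sqrt(np.einsum("in,ij,jn->n", u, B, u))     # one value per direction
    d_th = np.ones(n)
    cos2 = np.sum(d_exp * d_th)**2 / (np.sum(d_exp**2) * np.sum(d_th**2))
    return 100.0 * np.sqrt(max(0.0, 1.0 - cos2))           # STRESS with optimal scale factor

# Hypothetical matrices and coordinate change.
G = np.array([[4.0, 1.0], [1.0, 2.0]])
A = np.array([[3.0, 0.5], [0.5, 2.5]])
M = np.array([[1.0, 2.0], [0.0, 3.0]])

s_old = continuous_stress(whitened_matrix(G, A))
s_new = continuous_stress(whitened_matrix(M.T @ G @ M, M.T @ A @ M))
print(s_old, s_new)
```

Under these assumptions the two values agree to numerical precision, because a change of coordinates only rotates or reflects the whitened experimental ellipse and the integral covers all directions. A disagreement in a faithful implementation of the paper's definitions would be the kind of evidence described above.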

Figures

Figures reproduced from arXiv: 2601.13402 by Patrick De Visschere.

**Figure 2.** The continuous difference measures γ̄ (solid), V_AB (dashed), STRESS (dotted) and CV (dash-dotted) as a function of δ_λ for (i) δ_γ = δ_λ and Δθ = π/2 (blue curves) and for (ii) δ_γ = 0 and Δθ = 0 (red curves). In the second case STRESS and CV are identical and V_AB almost identical. In the first case the three measures V_AB, STRESS and CV are also very similar, at least up to δ = 0.9, corresponding with an aspect ratio of 1/19…

**Figure 3.** Discrete difference measures γ̄ (solid), V_AB (dashed), STRESS (dotted) and CV (dash-dotted) for 2 sampled directions π/3 apart and for 2 rather elongated ellipses (δ_λ = δ_γ = 0.9, Δθ = π/4). The red curve shows the average measure PF/3.

**Figure 4.** The continuous correlation coefficient as a function of …

**Figure 5.** The difference measures γ̄_µ (solid), V_AB,µ (dashed), STRESS (dotted), CV_µ (dash-dotted), R̄ (dashed green line) and d_ev (heavy red line) as a function of δ_µ for ellipses. V_AB,µ and STRESS are almost coincident. The straight line is the mutual asymptote of the first 4 measures (δ_µ/(2√2)) for δ_µ → 0. The slopes of d_ev (δ_µ/2) and of R̄ ((2/π)·δ_µ) are larger by factors √2 ≈ 1.4 and 4√2/π ≈ 1.8. Note that d_ev, ST…

**Figure 6.** Comparison between the general continuous measures (in blue) …

**Figure 7.** Comparison between the continuous measure …

Original abstract

We made a detailed review of the difference measures which have been used to judge the differences between experimentally determined color differences and theoretically defined ones, so-called line elements, for the human visual system. To eliminate the statistical errors due to variable and usually arbitrary sampling of the directions in a color point, we integrate the measures over a complete ellipsoid/ellipse. It turns out that in the limit for small deviations from circularity all proposed measures ($V_{AB}$, $\gamma-1$, $CV$ and $\mathrm{STRESS}$) are equivalent. For greater deviations the measures become distinct with $\gamma-1$ the most sensitive and $\mathrm{STRESS}$ the least. Ideally a difference measure should be coordinate independent and then it is advantageous to apply an affine transformation to both sets, e.g. turning the theoretical one into the unit ball. Although MacAdam already used this method but sampled the transformed ellipse, we integrate over the ellipsoid/ellipse. Comparing the results with the base measures we show that only $\mathrm{STRESS}$ is coordinate independent. Judging whether a single ellipsoid/ellipse resembles a unit ball can easily be done by comparing the eigenvalues with one and we show that our previously proposed error measure $d_{ev}$ (Candry e.a. Optics Express, 30, 36307, 2022) is the eigenvalue version of $\gamma-1$. We show why the short lived correlation coefficient $r$ was justly abandoned, being very coordinate dependent, but that Pant's recent geometric measure $1-R$ on the other hand is coordinate independent. All measures are routinely made scale invariant by the introduction of a scaling parameter, to be optimized. Lastly we show that from all measures the $\gamma-1$ ones are the only ones permitting the simple derivation of the globally optimized difference measure from the locally defined ones.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript reviews measures (V_AB, γ-1, CV, STRESS, r, 1-R) for comparing experimental color differences to theoretical line elements. It replaces directional sampling with integration over complete ellipsoids/ellipses to remove sampling bias, shows that all measures coincide in the small-deviation-from-circularity limit, and demonstrates that they separate at larger deviations (γ-1 most sensitive, STRESS least). After an affine map that turns the theoretical line element into the unit ball, only STRESS remains coordinate-independent; d_ev is identified as the eigenvalue form of γ-1; and only the γ-1 family permits a direct algebraic passage from locally to globally optimized measures. The correlation coefficient r is shown to be strongly coordinate-dependent while Pant’s 1-R is independent.

Significance. If the integration procedure and the coordinate-independence proof hold, the paper supplies a reproducible, sampling-free protocol for comparing color-difference metrics and isolates coordinate independence as a decisive selection criterion. The small-deviation equivalence result and the explicit link between d_ev and γ-1 are concrete contributions that can be checked numerically. The work also clarifies why certain historically used statistics were abandoned.

major comments (2)

[coordinate independence and MacAdam comparison] Section on coordinate independence and MacAdam comparison: the central claim that only STRESS survives the affine transformation to the unit ball rests on the assertion that full-ellipsoid integration is free of coordinate-dependent artifacts. No explicit demonstration is given that the integration measure remains invariant under the arbitrary affine map for highly eccentric ellipsoids; a counter-example or a proof that the weighting is uniform in the transformed metric would be required to substantiate the claim.
[derivation of globally optimized measure] The statement that γ-1 is the only family permitting a simple derivation of the globally optimized measure from the locally defined ones is load-bearing for the paper’s recommendation of γ-1. The derivation steps (including how the scaling parameter interacts with the integrated measure) are not shown explicitly; without them the uniqueness claim cannot be verified.

minor comments (2)

Notation for the scaling parameter is introduced without a consistent symbol across sections; a single symbol and a brief definition table would improve readability.
The numerical examples that illustrate the divergence of the measures for large deviations should be collected in a single table with explicit eccentricity values and the resulting measure differences.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment point by point below. Where the manuscript lacks explicit demonstrations or derivations, we will revise accordingly to strengthen the presentation.

Referee: Section on coordinate independence and MacAdam comparison: the central claim that only STRESS survives the affine transformation to the unit ball rests on the assertion that full-ellipsoid integration is free of coordinate-dependent artifacts. No explicit demonstration is given that the integration measure remains invariant under the arbitrary affine map for highly eccentric ellipsoids; a counter-example or a proof that the weighting is uniform in the transformed metric would be required to substantiate the claim.

Authors: We agree that the manuscript would benefit from an explicit demonstration that the full-ellipsoid integration measure is invariant under arbitrary affine transformations, particularly for highly eccentric cases. While the paper shows through direct computation that STRESS yields identical results before and after the transformation to the unit ball (unlike the other measures), and contrasts this with MacAdam's directional sampling approach, a general invariance argument is not supplied. In the revised manuscript we will add a short appendix containing a proof: under an affine map with Jacobian J the volume element transforms by |det J|, but because the integration domain is the complete ellipsoid that maps exactly onto the unit ball, the normalized integrand for STRESS remains unchanged, confirming coordinate independence. This addition directly addresses the referee's request and solidifies the central claim. revision: yes
Referee: The statement that γ-1 is the only family permitting a simple derivation of the globally optimized measure from the locally defined ones is load-bearing for the paper’s recommendation of γ-1. The derivation steps (including how the scaling parameter interacts with the integrated measure) are not shown explicitly; without them the uniqueness claim cannot be verified.

Authors: We accept that the explicit algebraic steps deriving the globally optimized γ-1 measure from its local definition are only summarized rather than fully expanded in the manuscript. The paper notes that γ-1 alone allows a direct passage once a scaling parameter is introduced and the measure is integrated over the ellipsoid, but the interaction between the scaling factor and the integrated quantity is not written out. In the revision we will insert a dedicated subsection that starts from the local expression, introduces the scaling parameter λ, shows how λ is chosen to minimize the integrated measure, and arrives at the closed-form global expression. This step-by-step derivation will make the uniqueness claim verifiable and will clarify why the other families do not admit an equally simple global extension. revision: yes

Circularity Check

0 steps flagged

Minor self-citation present but not load-bearing; central comparisons independently derived via integration

The paper conducts a review of color difference measures by proposing integration over complete ellipsoids/ellipses after affine transformation to assess coordinate independence and sensitivity. Claims that only STRESS is coordinate independent, that γ-1 is most sensitive for larger deviations, and that γ-1 uniquely permits deriving globally optimized measures from local ones follow directly from these explicit integrations and limit comparisons, without reducing to fitted parameters or self-citation chains. The single reference to prior work defining d_ev as the eigenvalue version of γ-1 is a minor self-citation that does not support the load-bearing results. The analysis is self-contained against the reviewed literature and does not match any enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Relies on standard mathematical properties of ellipsoids and affine transformations in color space; introduces a scaling parameter for scale invariance.

free parameters (1)

scaling parameter
Added to each measure to achieve scale invariance; optimized during comparison.

axioms (1)

domain assumption: Integration over the complete ellipsoid/ellipse removes statistical errors from arbitrary directional sampling
Invoked to justify the proposed evaluation method over point sampling.

pith-pipeline@v0.9.0 · 5862 in / 1259 out tokens · 51889 ms · 2026-05-21T15:53:56.205393+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: To eliminate the statistical errors due to variable and usually arbitrary sampling of the directions in a color point, we integrate the measures over a complete ellipsoid/ellipse... only STRESS is coordinate independent... d_ev is the eigenvalue version of γ−1

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: All measures are routinely made scale invariant by the introduction of a scaling parameter... γ−1 ones are the only ones permitting the simple derivation of the globally optimized difference measure from the locally defined ones

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.