GC-ART: Global Learnable Second-Order Rational Tone Curves for Illumination Robustness

Joyce Huang; Wei Huang

arxiv: 2605.07329 · v1 · submitted 2026-05-08 · 💻 cs.CV


Pith reviewed 2026-05-11 01:17 UTC · model grok-4.3

keywords: illumination robustness · tone mapping · image classification · rational curves · histogram conditioning · differentiable preprocessing · contrast enhancement

The pith

A lightweight module using rational tone curves predicted from histograms matches clean-image accuracy while improving robustness to darkening and contrast changes in image classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a simple global adjustment to image brightness and contrast, learned from the image's color histograms, can help neural networks classify images more reliably when lighting is uneven or poor. It does this by having a tiny neural network predict the shape of a rational function that maps input pixel values to output values for each color channel. The function is applied the same way to every pixel, which keeps edges sharp by design. Because the whole system trains together, the adjustments focus on what helps the final classification task. This matters if true because it suggests cheap, easy-to-add steps can fix many real-world lighting problems without redesigning the main network or using heavy local processing.

Core claim

GC-ART predicts an endpoint-pinned rational tone curve from per-channel soft histograms using a 643-parameter MLP, then applies the curve pointwise before the classifier. The module is trained end-to-end with cross-entropy and a soft monotonicity penalty. On CIFAR-10 with a CIFAR-style ResNet-18, GC-ART matches the baseline's clean accuracy, improves over the baseline on multiplicative darkening, and achieves the best learned-method result on contrast corruption.
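To make the "endpoint-pinned" idea concrete, here is a minimal sketch of one way such a curve could be evaluated. It reads the formula quoted further down this page as a ratio of two quadratics, f(x) = (a x² + b x) / (d x² + e x + 1), so f(0) = 0 holds automatically. Pinning f(1) = 1 then requires a + b = d + e + 1. Eliminating `b` to enforce that constraint is an assumption of this sketch, and the function names are hypothetical; the paper may parameterise the curve differently.

```python
def pinned_rational_curve(x, a, d, e):
    """Evaluate f(x) = (a*x^2 + b*x) / (d*x^2 + e*x + 1) with f(0) = 0 and f(1) = 1.

    Assumption of this sketch: b is not free but set to d + e + 1 - a,
    which is exactly the condition that makes f(1) equal 1.
    """
    b = d + e + 1.0 - a
    denom = d * x * x + e * x + 1.0
    if denom <= 0.0:
        raise ValueError("denominator must stay positive on [0, 1]")
    return (a * x * x + b * x) / denom


def apply_curve(channel, a, d, e):
    """Apply the same curve to every pixel of one channel (flat list of values in [0, 1])."""
    return [pinned_rational_curve(v, a, d, e) for v in channel]


# a = d = e = 0 gives the identity; a = d = 0, e = 2 gives 3x / (1 + 2x), which lifts shadows.
print(apply_curve([0.0, 0.25, 0.5, 1.0], a=0.0, d=0.0, e=0.0))
print(apply_curve([0.0, 0.25, 0.5, 1.0], a=0.0, d=0.0, e=2.0))
```

Because the map depends only on a pixel's value and not on its position, edge locations cannot move, which is the "by construction" property the abstract refers to.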

What carries the argument

The endpoint-pinned rational tone curve predicted by a small MLP from per-channel soft histograms, applied pointwise to correct global illumination.

Load-bearing premise

A single global per-channel rational tone curve derived from soft histograms provides sufficient correction for illumination variations in classification.

What would settle it

Observing no improvement or degradation on a benchmark with spatially varying illumination corruptions would indicate that global curves alone are insufficient.
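The distinction between uniform and spatially varying corruption can be illustrated with a toy example. This is a sketch on nested lists, with hypothetical function names and a linear left-to-right gain ramp chosen only for illustration; it is not a corruption used in the paper.

```python
def uniform_darken(image, gain):
    """Multiply every pixel by the same gain (spatially uniform corruption)."""
    return [[v * gain for v in row] for row in image]


def gradient_darken(image, left_gain, right_gain):
    """Multiply each column by a gain interpolated linearly from left to right."""
    width = len(image[0])
    gains = [
        left_gain + (right_gain - left_gain) * c / max(width - 1, 1)
        for c in range(width)
    ]
    return [[v * g for v, g in zip(row, gains)] for row in image]


flat = [[0.8] * 4 for _ in range(2)]         # constant-intensity toy image
print(uniform_darken(flat, 0.5)[0])          # every pixel still shares one value
print(gradient_darken(flat, 1.0, 0.25)[0])   # equal inputs now differ by position
```

Under the uniform gain, each corrupted value corresponds to one original value, so a pointwise curve can in principle undo it. Under the ramp, pixels with different original values can land on the same corrupted value depending on where they sit, and no map that looks only at the pixel value can separate them again.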

Figures

Figures reproduced from arXiv: 2605.07329 by Joyce Huang, Wei Huang.

**Figure 1.** Accuracy as a function of corruption severity for the four learned systems on CIFAR-10-C-style brightness, contrast, and darkening corruptions. Mean and standard deviation over 3 seeds.

| Module | Params | Total FLOPs (32 × 32 input) |
| --- | --- | --- |
| GC-ART | 643 | 269,088 |
| Zero-DCE | 11,011 | 11,252,736 |
| Zero-DCE++ | 1,953 | 1,908,736 |
| Histogram Equalization | 0 | 19,200 |
| Gamma (γ=2.2) | 0 | 6,144 |
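The cost figures listed with Figure 1 can be turned into relative costs with a few lines of arithmetic. The sketch below only divides the FLOP counts given above; the dictionary and function names are hypothetical.

```python
# FLOP counts as listed alongside Figure 1.
flops = {
    "GC-ART": 269_088,
    "Zero-DCE": 11_252_736,
    "Zero-DCE++": 1_908_736,
    "Histogram Equalization": 19_200,
    "Gamma (γ=2.2)": 6_144,
}


def flops_ratio(module, reference="GC-ART"):
    """Return how many times more FLOPs `module` uses than `reference`."""
    return flops[module] / flops[reference]


for name in flops:
    print(f"{name}: {flops_ratio(name):.2f}x the FLOPs of GC-ART")
```

By these counts Zero-DCE costs roughly 42 times and Zero-DCE++ roughly 7 times as many FLOPs as GC-ART, while the two fixed, parameter-free transforms remain cheaper than GC-ART.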

Original abstract

We introduce GC-ART (Global Curve Adaptive Rational Tone-mapping), a lightweight differentiable pre-processing module for robust image classification. GC-ART predicts an endpoint-pinned rational tone curve from per-channel soft histograms using a 643-parameter MLP, then applies the curve pointwise before the classifier. The module is trained end-to-end with cross-entropy and a soft monotonicity penalty. On CIFAR-10 with a CIFAR-style ResNet-18, GC-ART matches clean accuracy with the unenhanced baseline and other learned enhancers, improves over the baseline on multiplicative darkening, and achieves the best learned-method result on contrast corruption (48.45% vs. 46.27% for the baseline and 47.13% for Zero-DCE++). These results suggest that histogram-conditioned rational curves can learn useful global tone corrections, including contrast-expanding behavior, while preserving edge locations by construction through pointwise mapping. GC-ART also uses substantially fewer FLOPs than convolutional learned enhancers at 32 x 32. The current hyperparameters are untuned, leaving room for systematic improvement.
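The abstract does not define the soft histogram, so the following is only one plausible construction: each pixel spreads a unit of mass over bin centres using Gaussian weights, which keeps the histogram smooth in the pixel values. The bin count, bandwidth, and function name are assumptions made for illustration, not values taken from the paper.

```python
import math


def soft_histogram(values, num_bins=16, bandwidth=0.05):
    """Gaussian-kernel soft histogram of values in [0, 1], normalised to sum to 1.

    num_bins and bandwidth are illustrative defaults, not the paper's settings.
    """
    if not values:
        raise ValueError("values must be non-empty")
    centers = [(k + 0.5) / num_bins for k in range(num_bins)]
    hist = [0.0] * num_bins
    for v in values:
        weights = [math.exp(-0.5 * ((v - c) / bandwidth) ** 2) for c in centers]
        total = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / total  # each pixel contributes exactly one unit of mass
    return [h / len(values) for h in hist]


dark_channel = [0.05] * 10  # toy "dark" channel
hist = soft_histogram(dark_channel)
print(hist.index(max(hist)), round(sum(hist), 6))
```

A hard histogram assigns each pixel to a single bin and has zero gradient almost everywhere; a soft assignment like this is one common way to let gradients from the classification loss reach the histogram inputs.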

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GC-ART gives a cheap, differentiable global tone-curve layer with small gains on synthetic CIFAR-10 lighting corruptions, but the evidence is too thin and the scope too narrow to support broader illumination-robustness claims.

The paper's core idea is a 643-parameter MLP that reads soft per-channel histograms and outputs an endpoint-pinned second-order rational tone curve, applied pointwise before the classifier. It trains end-to-end with cross-entropy plus a soft monotonicity penalty. On CIFAR-10 with a CIFAR-style ResNet-18 it keeps clean accuracy flat while improving over the baseline on multiplicative darkening and reaching the best reported learned-method number on contrast corruption (48.45%).

The module is genuinely lightweight and uses far fewer FLOPs than convolutional enhancers at 32x32 resolution, which is a practical plus for anyone who wants a plug-in pre-processor rather than a heavier network redesign. The rational-curve choice and histogram conditioning are a clean, specific combination that avoids the usual gamma or spline alternatives while encouraging monotonicity through a soft penalty rather than guaranteeing it. That part is new enough and executed simply enough to be worth noting.

The experimental support is the main weakness. The abstract supplies only point estimates with no run-to-run variance, no statistical tests, and no ablation of the monotonicity term or training details. All reported corruptions are spatially uniform by design, so the 2.18-point contrast gain may simply reflect better global histogram matching rather than any real handling of local shadows or uneven lighting. The stress-test concern holds: nothing in the current setup provides spatial selectivity, and the paper offers no evidence on datasets with realistic non-uniform illumination. This leaves the title's claim of illumination robustness resting on an untested extrapolation.

The work is for people building lightweight robustness modules or studying efficient pre-processing layers. It shows clear, honest engineering thinking on keeping the module small and monotonic, but the results are preliminary. I would send it to peer review if the authors add variance numbers, ablations, and at least one test on local or real-world lighting corruptions; otherwise it is still too light for a full referee round.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces GC-ART, a lightweight differentiable pre-processing module that predicts an endpoint-pinned rational tone curve per channel from soft histograms via a 643-parameter MLP, applies the curve pointwise, and is trained end-to-end with cross-entropy plus a soft monotonicity penalty. On CIFAR-10 with a CIFAR-style ResNet-18, it matches the unenhanced baseline on clean data, improves over the baseline on multiplicative darkening, and reports the best accuracy among learned methods on contrast corruption (48.45% vs. 46.27% baseline and 47.13% for Zero-DCE++), while using substantially fewer FLOPs than convolutional enhancers at 32x32 resolution.

Significance. If the empirical claims hold under more rigorous validation, the work demonstrates that a very small global histogram-conditioned rational curve module can deliver competitive robustness gains on selected global illumination corruptions with minimal overhead. This could be useful for efficient, low-parameter robustness pipelines, but the significance is limited by the narrow scope of the tested corruptions and the absence of statistical support for the reported gains.

major comments (2)

Experimental results: the reported accuracies on corrupted CIFAR-10 (including the 2.18 pp gain on contrast corruption) are given as single point estimates without run-to-run variance, error bars, statistical tests, or full details of the training protocol and hyperparameter choices. This directly weakens support for the central performance claims relative to the baseline and Zero-DCE++.
Method description and evaluation setup: the core assumption that a single global per-channel rational curve (regressed from soft histograms) suffices for illumination robustness is load-bearing for the title and abstract claims, yet the evaluation uses only spatially uniform synthetic corruptions. No experiments or discussion address local effects such as shadows or non-uniform lighting, leaving the extrapolation from global to general illumination robustness untested.

minor comments (2)

Abstract: the statement that GC-ART 'uses substantially fewer FLOPs' lacks a concrete number or reference to a comparison table.
Notation and equations: the exact functional form of the second-order rational tone curve, the definition of the soft histogram input, and the implementation of the monotonicity penalty should be given explicitly with numbered equations for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on experimental rigor and evaluation scope. We respond to each major comment below and describe the revisions we will implement.

Referee: Experimental results: the reported accuracies on corrupted CIFAR-10 (including the 2.18 pp gain on contrast corruption) are given as single point estimates without run-to-run variance, error bars, statistical tests, or full details of the training protocol and hyperparameter choices. This directly weakens support for the central performance claims relative to the baseline and Zero-DCE++.

Authors: We agree that single-run point estimates limit the strength of the reported gains. In the revised manuscript we will rerun all experiments with a minimum of five independent random seeds, report mean accuracy and standard deviation for every setting, add error bars to the relevant tables and figures, and include paired statistical tests (e.g., t-tests) to assess whether the observed improvements over the baseline and Zero-DCE++ are significant. Complete training protocols, hyperparameter values, and data-augmentation details will be moved to the supplementary material.
Revision: yes

Referee: Method description and evaluation setup: the core assumption that a single global per-channel rational curve (regressed from soft histograms) suffices for illumination robustness is load-bearing for the title and abstract claims, yet the evaluation uses only spatially uniform synthetic corruptions. No experiments or discussion address local effects such as shadows or non-uniform lighting, leaving the extrapolation from global to general illumination robustness untested.

Authors: The method is deliberately restricted to global, per-channel tone curves conditioned on whole-image histograms; this design matches the spatially uniform corruptions we evaluate (multiplicative darkening and contrast). We will revise the title, abstract, and introduction to state explicitly that the work targets global illumination robustness. We will also add a dedicated limitations paragraph clarifying that spatially varying effects such as shadows or non-uniform lighting fall outside the current global-curve formulation and would require local adaptation methods. No new experiments on non-uniform lighting are planned for this revision, as they lie beyond the intended scope.
Revision: partial
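The paired comparison promised in the first response could be computed along the following lines. This is a sketch using only the Python standard library; the function name is hypothetical, and the per-seed accuracies are placeholders invented for the example, not results from the paper.

```python
import math
import statistics


def paired_t_statistic(acc_a, acc_b):
    """Return (mean difference, std of differences, t statistic, degrees of freedom)."""
    if len(acc_a) != len(acc_b) or len(acc_a) < 2:
        raise ValueError("need two equally long lists with at least two seeds")
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0.0:
        raise ValueError("paired differences are all identical; t is undefined")
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, sd_diff, t_stat, len(diffs) - 1


# Placeholder accuracies for five seeds; NOT results from the paper.
method_acc = [0.61, 0.63, 0.62, 0.64, 0.60]
baseline_acc = [0.60, 0.61, 0.61, 0.62, 0.60]
mean_diff, sd_diff, t_stat, dof = paired_t_statistic(method_acc, baseline_acc)
print(f"mean diff = {mean_diff:.4f}, sd = {sd_diff:.4f}, t = {t_stat:.2f}, dof = {dof}")
```

Pairing by seed only makes sense if both systems share the seed-dependent factors (initialisation, data order). A p-value would come from the t distribution with the returned degrees of freedom, for instance by cross-checking against `scipy.stats.ttest_rel`.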

Circularity Check

0 steps flagged

No circularity detected in the derivation chain

The paper introduces GC-ART as an end-to-end trainable MLP that regresses an endpoint-pinned rational tone curve from per-channel soft histograms and applies it pointwise before classification. Training uses standard cross-entropy plus a monotonicity penalty on CIFAR-10; the reported accuracy numbers are direct empirical measurements on held-out benchmark corruptions rather than quantities defined by the fitted parameters themselves. No self-citations are invoked to justify uniqueness or to close a derivation loop, no ansatz is smuggled in via prior work, and no prediction is statistically forced by construction from a subset of the same data. The central claim therefore remains an independent empirical result.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on standard neural-network training assumptions plus the domain assumption that a monotonic rational curve can capture useful global tone corrections; no new physical entities are postulated.

free parameters (1)

MLP weights (643 parameters)
All weights of the histogram-to-curve MLP are fitted to the training data via gradient descent.

axioms (1)

domain assumption: A soft monotonicity penalty is sufficient to produce valid non-decreasing tone curves
Invoked in the training objective to ensure the learned rational function remains a proper tone-mapping curve.
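Since the page does not give the penalty's functional form, the sketch below shows one common way a soft monotonicity penalty could be written: sample the curve on a uniform grid and average a hinge over any downward steps. The grid size, the hinge form, and the function name are assumptions of this illustration, not the paper's definition.

```python
def monotonicity_penalty(curve, num_points=65):
    """Mean hinge on downward steps of `curve` sampled on a uniform grid over [0, 1].

    Returns 0.0 when the sampled curve never decreases.
    """
    xs = [i / (num_points - 1) for i in range(num_points)]
    ys = [curve(x) for x in xs]
    drops = [max(0.0, y0 - y1) for y0, y1 in zip(ys, ys[1:])]
    return sum(drops) / len(drops)


def identity(x):
    return x


def overshooting(x):
    # Endpoint-pinned (f(0) = 0, f(1) = 1) but decreasing for x > 0.75.
    return -2.0 * x * x + 3.0 * x


print(monotonicity_penalty(identity))      # no downward steps
print(monotonicity_penalty(overshooting))  # positive: the curve falls near x = 1
```

A penalty of this kind discourages decreasing segments but, with a finite weight in the loss, does not by itself rule them out. That is why the sufficiency of the soft penalty is listed here as an assumption rather than a guarantee.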

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: GC-ART predicts an endpoint-pinned rational tone curve from per-channel soft histograms using a 643-parameter MLP, then applies the curve pointwise... f(x; a, b, d, e) = (a x² + b x) / (d x² + e x + 1)

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: The rational family is intended to provide a compact curve class that can represent multiple exposure corrections... concave shadow-lifting, convex highlight-compressing, sigmoidal...

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1] C. Guo, C. Li, J. Guo, C. C. Loy, J. Hou, S. Kwong, and R. Cong, “Zero-reference deep curve estimation for low-light image enhancement,” in Proc. IEEE/CVF CVPR, 2020, pp. 1780–1789.

[2] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE CVPR, 2016, pp. 770–778.

[3] D. Hendrycks and T. Dietterich, “Benchmarking neural network robustness to common corruptions and perturbations,” in ICLR, 2019.

[4] L. Ma, T. Ma, R. Liu, X. Fan, and Z. Luo, “Toward fast, flexible, and robust low-light image enhancement,” in Proc. IEEE/CVF CVPR, 2022, pp. 5637–5646.

[5] A. Krizhevsky, “Learning multiple layers of features from tiny images,” Tech. Rep., University of Toronto, 2009.

[6] C. Wei, W. Wang, W. Yang, and J. Liu, “Deep Retinex decomposition for low-light enhancement,” in BMVC, 2018.

[7] Y. Zhang, J. Zhang, and X. Guo, “Kindling the darkness: A practical low-light image enhancer,” in ACM MM, 2019, pp. 1632–1640.

[8] K. G. Lore, A. Akintayo, and S. Sarkar, “LLNet: A deep autoencoder approach to natural low-light image enhancement,” Pattern Recognition, vol. 61, pp. 650–662, 2017.

[9] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 4th ed., Pearson, 2018.

[10] K. Zuiderveld, “Contrast limited adaptive histogram equalization,” in Graphics Gems IV, P. S. Heckbert, Ed., pp. 474–485, Academic Press, 1994.

[11] E. Reinhard, M. Stark, P. Shirley, and J. Ferwerda, “Photographic tone reproduction for digital images,” ACM Trans. Graphics, vol. 21, no. 3, pp. 267–276, 2002.

[12] F. Drago, K. Myszkowski, T. Annen, and N. Chiba, “Adaptive logarithmic mapping for displaying high contrast scenes,” Computer Graphics Forum, vol. 22, no. 3, pp. 419–426, 2003.

[13] J. Hable, “Uncharted 2: HDR lighting,” presented at the Game Developers Conference (GDC), San Francisco, CA, 2010.

[14] K. Narkowicz, “ACES filmic tone mapping curve,” 2016, https://knarkowicz.wordpress.com/2016/01/06/aces-filmic-tone-mapping-curve/

[15] P. Todorov, J. Hartig, J. Meyer-Siemon, M. Fiedler, and G. Schewior, “TGTM: TinyML-based global tone mapping for HDR sensors,” arXiv preprint arXiv:2405.05016, 2024.

[16] H. Zeng, J. Cai, L. Li, Z. Cao, and L. Zhang, “Learning image-adaptive 3D lookup tables for high performance photo enhancement in real-time,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 4, pp. 2058–2073, 2022.

[17] M. Gharbi, J. Chen, J. T. Barron, S. W. Hasinoff, and F. Durand, “Deep bilateral learning for real-time image enhancement,” ACM Trans. Graphics, vol. 36, no. 4, pp. 118:1–118:12, 2017.

[18] C. Chen, Q. Chen, J. Xu, and V. Koltun, “Learning to see in the dark,” in Proc. IEEE/CVF CVPR, 2018, pp. 3291–3300.

[19] Y. Hu, H. He, C. Xu, B. Wang, and S. Lin, “Exposure: A white-box photo post-processing framework,” ACM Trans. Graphics, vol. 37, no. 2, pp. 26:1–26:17, 2018.

[20] D. Ha, A. M. Dai, and Q. V. Le, “HyperNetworks,” in ICLR, 2017.

[21] E. Ustinova and V. Lempitsky, “Learning deep embeddings with histogram loss,” in NeurIPS, 2016, pp. 4170–4178.

[22] M. Avi-Aharon, A. Arbelle, and T. R. Raviv, “DeepHist: Differentiable joint pyramidal histogram layer for robust visual recognition,” arXiv preprint arXiv:2005.03995, 2020.

[23] V. Nair and G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in ICML, 2010.

[24] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in ICLR, 2015.

[25] I. Loshchilov and F. Hutter, “SGDR: Stochastic gradient descent with warm restarts,” in ICLR, 2017.

[26] A. Paszke et al., “PyTorch: An imperative style, high-performance deep learning library,” in NeurIPS, 2019, pp. 8024–8035.
