SENTRY: Statistical Reliability Analysis of Vision Transformers Under Soft Errors

Christian Herglotz; Mahdi Taheri; Maksim Jenihhin; Michael Hubner; Pramit Kumar Bhaduri; Samira Nazari

arXiv: 2606.07620 · v1 · pith:ITA7GF5Tnew · submitted 2026-05-30 · 💻 cs.CV · cs.AI · cs.DC · cs.LG

Pramit Kumar Bhaduri, Mahdi Taheri, Samira Nazari, Maksim Jenihhin, Christian Herglotz, Michael Hubner

Pith reviewed 2026-06-28 18:48 UTC · model grok-4.3

keywords: vision transformers, soft errors, fault injection, statistical sampling, reliability analysis, bit-flips, normalization layers, finite population

The pith

Finite-population sampling bounds Vision Transformer soft-error failure rates within a 1% margin at 99% confidence using only a few thousand injections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a statistical fault injection method for Vision Transformers that treats the space of possible soft errors as a finite population and applies sampling theory to estimate reliability. It shows that this yields formal bounds on failure rates with high confidence while using far fewer tests than exhaustive injection, regardless of model size. The approach matters for safety-critical uses of ViTs because their large parameter counts make full campaigns impractical. The work also maps a non-uniform error landscape in which a small fraction of bit-flips trigger failures that mostly cause total accuracy loss, with the failures concentrated in normalization layers and exponent bits.

Core claim

By modeling soft-error possibilities in ViT parameter storage as a finite population and drawing samples according to sampling theory, failure rates can be bounded within a 1% margin at 99% confidence with a few thousand injections; this holds across model scales and delivers up to a 10,700-fold reduction in experimental cost while still permitting localization of vulnerabilities to specific layers and bit positions.
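As a rough illustration of why the sampling budget stops growing with model size, the sketch below evaluates the sample-size formula commonly used in statistical fault injection (the form given by Leveugle et al., listed as [12] in the reference graph). The text here does not show the paper's own derivation, so the choice of formula, the function name `sample_size`, the conservative default p = 0.5, and the example parameter counts are all assumptions.

```python
import math
from statistics import NormalDist


def sample_size(population, margin=0.01, confidence=0.99, p=0.5):
    """Injections needed to estimate a failure proportion within +/- margin.

    Assumed formula (standard in statistical fault injection, not taken
    from the paper itself):
        n = N / (1 + e^2 * (N - 1) / (t^2 * p * (1 - p)))
    where N is the population size, e the margin, t the normal quantile
    for the chosen confidence, and p the assumed failure proportion.
    """
    t = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n = population / (1.0 + margin**2 * (population - 1) / (t**2 * p * (1.0 - p)))
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical parameter counts; each FP32 parameter contributes 32 bit-flip sites.
    for n_params in (10**5, 10**6, 10**7, 10**8):
        population = n_params * 32
        print(n_params, sample_size(population), sample_size(population, p=0.03))
```

Under these assumptions the required sample count saturates as the population grows: roughly 16.6 thousand injections with p = 0.5, and just under two thousand when p is set near the reported 3% failure rate. Which prior the paper actually uses is not stated in the text above.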

What carries the argument

Finite-population sampling theory applied to fault-injection campaigns on ViT parameters stored in FP32 format.

If this is right

Reliability statements remain valid when scaling from ViT-Tiny to ViT-Small and beyond.
Only about 3% of FP32 bit-flips produce any failure, yet the great majority of those produce catastrophic accuracy collapse.
Vulnerabilities concentrate in normalization layers and the exponent bits of the IEEE-754 representation.
Designers can target hardening effort at the identified layers and bits rather than the entire model.
The same sampling budget suffices for any model size, removing the need to increase test effort with parameter count.
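To make the exponent-bit sensitivity concrete, here is a minimal sketch of a single bit-flip in an IEEE-754 binary32 value, using only the Python standard library. The helper name `flip_bit` and the example weight 0.5 are illustrative assumptions, not the paper's injection tooling. Bit 31 is the sign, bits 30 to 23 the exponent, and bits 22 to 0 the mantissa.

```python
import struct


def flip_bit(value, bit):
    """Return the FP32 value obtained by flipping one bit (0 = LSB, 31 = sign)."""
    if not 0 <= bit <= 31:
        raise ValueError("bit must be in 0..31")
    # Reinterpret the float's 4 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR toggles exactly one bit, then reinterpret the result as a float.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    weight = 0.5  # hypothetical parameter value
    for bit in (0, 22, 23, 30, 31):
        print(bit, flip_bit(weight, bit))
```

Flipping the lowest mantissa bit of 0.5 changes it by 2^-24, while flipping bit 30 turns it into 2^127. A magnitude jump of that size is one plausible reason a single exponent flip can be far more damaging than a mantissa flip.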

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The sampling bounds could be recomputed on-the-fly during edge deployment to adapt to new input distributions.
Error-correcting codes or selective bit protection could be applied only to the high-impact exponent positions identified by the method.
The framework might extend to other floating-point formats or quantization schemes used in compressed ViTs.

Load-bearing premise

The space of all possible soft errors in the model parameters forms a finite population to which standard sampling formulas apply directly without systematic bias from architecture or input data.
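One way to picture this premise: if every (parameter, bit) pair is one member of the population, a campaign is a uniform draw without replacement from that set. The sketch below is an assumption-laden illustration; the function name `sample_fault_sites`, the flat indexing scheme, and the fixed seed are hypothetical and not drawn from the paper.

```python
import random


def sample_fault_sites(n_params, n_samples, bits_per_param=32, seed=0):
    """Draw distinct (parameter index, bit position) pairs uniformly at random."""
    population = n_params * bits_per_param
    if n_samples > population:
        raise ValueError("cannot sample more faults than the population holds")
    rng = random.Random(seed)
    # Sampling flat indices without replacement keeps every site equally likely.
    flat = rng.sample(range(population), n_samples)
    # Decode each flat index into (parameter index, bit position).
    return [divmod(index, bits_per_param) for index in flat]


if __name__ == "__main__":
    for param_index, bit in sample_fault_sites(n_params=1000, n_samples=5):
        print(param_index, bit)
```

Any systematic bias the premise warns about would enter through how failure is labeled for each drawn site, not through the draw itself.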

What would settle it

Run an exhaustive fault-injection campaign on ViT-Tiny, compute the true failure rate, and check whether it lies inside the 1% interval predicted by the sampling procedure at 99% confidence.
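The check described above can be sketched as follows, assuming a normal-approximation interval with the usual finite-population correction. The paper's exact interval construction is not given in the text here, so the function names `failure_rate_interval` and `bound_holds` and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def failure_rate_interval(failures, n_samples, population, confidence=0.99):
    """Return (estimate, low, high) for a sampled failure proportion.

    Assumes simple random sampling without replacement, a population
    larger than one, and a normal approximation scaled by the
    finite-population correction (N - n) / (N - 1).
    """
    p_hat = failures / n_samples
    t = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    fpc = (population - n_samples) / (population - 1)
    half_width = t * math.sqrt(p_hat * (1.0 - p_hat) / n_samples * fpc)
    return p_hat, max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def bound_holds(true_rate, interval):
    """True if the exhaustively measured rate lies inside the sampled interval."""
    _, low, high = interval
    return low <= true_rate <= high


if __name__ == "__main__":
    # Hypothetical campaign: 60 failures out of 2000 injections, 10^8 fault sites.
    interval = failure_rate_interval(60, 2000, 10**8)
    print(interval, bound_holds(0.031, interval))
```

A single run can only falsify the bound. Repeating the sampled campaign many times against one exhaustive baseline would show whether coverage is actually close to 99%.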

Figures

Figures reproduced from arXiv: 2606.07620 by Christian Herglotz, Mahdi Taheri, Maksim Jenihhin, Michael Hubner, Pramit Kumar Bhaduri, Samira Nazari.

**Figure 3.** Component vulnerability heatmap for blocks 2–5. Each …

**Figure 4.** Cross-architecture comparison. (a) Overall failure rates with 99% confidence intervals. The gray band marks the region …

**Figure 5.** Bit-wise failure rate comparison across architectures. (a) Full scale, showing matching bit-30 spikes at 87.4% and …

Original abstract

With the growth of Vision Transformers in safety-critical domains like autonomous systems and medical imaging, ensuring their reliability against soft errors is paramount. While ViTs offer state-of-the-art accuracy, their massive parameter counts render exhaustive fault injection campaigns infeasible. To bridge this gap, a statistical fault injection framework is presented, leveraging finite-population sampling theory to provide formal reliability guarantees. It is demonstrated that failure rates are bounded within a 1% margin at 99% confidence using only a few thousand samples, regardless of model scale. This methodology achieves up to a 10,700 times reduction in experimental cost compared to exhaustive approaches, while preserving the ability to localize vulnerabilities across architectural components. Through extensive evaluation of different architectures like ViT-Tiny and ViT-Small, a highly non-uniform reliability landscape is uncovered. It is shown that while only 3% of FP32 bit-flips result in failure, the vast majority of these events lead to catastrophic accuracy collapse. Specific vulnerabilities are localized to normalization layers and critical exponent bits within the IEEE-754 format, providing a mathematical foundation and actionable insights for the design of hardened, edge-deployed ViT architectures.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Sampling gives cheap 99% bounds on ViT bit-flip failures but the data-dependent definition of failure is the part that needs the most scrutiny.

The main takeaway is that the authors apply finite-population sampling to fault injection on Vision Transformers and claim that a few thousand random bit-flips suffice to bound the failure rate within 1% at 99% confidence, independent of model size, while cutting the experimental budget by roughly four orders of magnitude. They also map the failures and show that only about 3% of FP32 flips cause accuracy collapse, yet those that do are usually severe, with clear concentration in normalization layers and exponent bits.

What works is the practical payoff: exhaustive injection is impossible on anything larger than a toy model, so a statistically grounded shortcut that still lets them localize weak spots is useful for anyone hardening ViTs for edge or medical use. The non-uniformity result is concrete and actionable.

The soft spot is exactly the one the stress-test flags. Failure is defined by accuracy drop on a fixed evaluation set, so the proportion being estimated is conditional on that data distribution. If the deployment inputs shift the effective failure labels, the claimed margin no longer holds without additional stratification or re-sampling. The abstract does not show whether they validated the bound against an exhaustive run on a small model or checked sensitivity to the test set. That gap is real but probably fixable with one extra section.

This is the kind of paper that belongs in a reliability or embedded-vision venue. A serious referee should see it because the core idea is sound and the cost reduction is large enough to matter, even if the statistical assumptions need tighter justification. I would send it out rather than desk-reject.

Referee Report

2 major / 1 minor

Summary. The paper presents SENTRY, a statistical fault injection framework for Vision Transformers that applies finite-population sampling theory to bound soft-error failure rates within a 1% margin at 99% confidence using only a few thousand samples, independent of model scale. It claims up to 10,700x experimental cost reduction versus exhaustive injection while still localizing vulnerabilities, reports that only 3% of FP32 bit-flips cause failure (yet most of those are catastrophic), and identifies non-uniform sensitivities concentrated in normalization layers and exponent bits.

Significance. If the sampling bounds hold without data-distribution bias, the work would supply a scalable, formally grounded alternative to exhaustive campaigns for reliability assessment of large ViTs in safety-critical settings and would yield concrete architectural hardening targets.

major comments (2)

[Abstract] The claim that standard finite-population sampling delivers unbiased 1% margins at 99% confidence independent of model scale rests on the assumption that each bit-flip has a fixed, architecture- and data-independent failure label; yet failure is defined via accuracy drop on a fixed evaluation set, and the reported non-uniform sensitivities (normalization layers, exponent bits) imply that any shift in input distribution can move the proportion outside the stated bound.
[Abstract] No derivation of the finite-population correction, no validation against exhaustive baselines on even a small ViT, and no error-bar methodology are supplied, so the central 1% margin claim at a few thousand samples cannot be verified from the provided text.

minor comments (1)

The abstract states results for ViT-Tiny and ViT-Small but does not reference the corresponding sections or tables that contain the per-component vulnerability statistics.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our statistical fault injection framework. We address each major comment below and will revise the manuscript to improve clarity and completeness.

Referee: [Abstract] The claim that standard finite-population sampling delivers unbiased 1% margins at 99% confidence independent of model scale rests on the assumption that each bit-flip has a fixed, architecture- and data-independent failure label; yet failure is defined via accuracy drop on a fixed evaluation set, and the reported non-uniform sensitivities (normalization layers, exponent bits) imply that any shift in input distribution can move the proportion outside the stated bound.

Authors: The finite-population sampling bounds the failure proportion p for the specific labeling induced by the fixed evaluation set and model under test; this is the standard setup for fault-injection studies. The independence from model scale refers to the number of samples required to achieve the stated margin (via the finite-population correction), not to invariance of p itself. We acknowledge that a change in input distribution could alter p and will add an explicit limitations paragraph clarifying that the reported bounds apply to the evaluation distribution used in the experiments.

revision: yes

Referee: [Abstract] No derivation of the finite-population correction, no validation against exhaustive baselines on even a small ViT, and no error-bar methodology are supplied, so the central 1% margin claim at a few thousand samples cannot be verified from the provided text.

Authors: We agree that the main text should contain an explicit derivation of the finite-population correction, a validation experiment comparing sampling bounds against exhaustive injection on ViT-Tiny, and a description of the error-bar construction. These elements will be added to the revised manuscript (and the abstract updated to reference them).

revision: yes

Circularity Check

0 steps flagged

No circularity; applies external finite-population sampling theory to fault population

The derivation applies standard sampling theory (finite-population correction for proportion estimator and confidence bounds) to a defined population of bit-flips. Failure labels are obtained by direct injection and evaluation on a fixed dataset, but the reported bounds and cost-reduction factors are produced by the external statistical formulas rather than by any quantity fitted from the ViT outputs or by self-citation. No self-definitional, fitted-input-renamed-as-prediction, or load-bearing self-citation steps are present. The central claim therefore remains independent of the target ViT data.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Central claim rests on direct applicability of finite-population sampling theory to the discrete space of IEEE-754 bit-flips in ViT weights; no free parameters or invented entities are introduced in the abstract.

axioms (1)

[standard math] Finite-population sampling theory supplies formal reliability bounds when applied to the complete set of possible soft errors in model parameters.
Invoked to justify 99% confidence bounds from a few thousand samples regardless of model scale.

Reference graph

Works this paper leans on

21 extracted references · 4 canonical work pages · 1 internal anchor

[1] M. Taheri, “PhD thesis summary: Methods for reliability assessment and enhancement of deep neural network hardware accelerators,” arXiv preprint arXiv:2603.08724, 2026.

[2] R. M. Kodamanchili, N. Cherezova, M. Taheri, and M. Jenihhin, “Adaptive fault resilience for early-exit dnns,” in 2025 IEEE International Test Conference in Asia (ITC-Asia). IEEE, 2025, pp. 108–113.

[3] D. Monachan, S. Nazari, M. Taheri, A. Azarpeyvand, M. Krstic, M. Huebner, and C. Herglotz, “Mix-and-match pruning: Globally guided layer-wise sparsification of dnns,” arXiv preprint arXiv:2603.20280, 2026.

[4] A. S. Mohammadi, S. Nazari, A. Azarpeyvand, M. Taheri, M. Krstic, M. Hübner, C. Herglotz, and T. Ghasempouri, “Resq: A unified framework for reliability-and security enhancement of quantized deep neural networks,” in 2026 IEEE 27th Latin American Test Symposium (LATS). IEEE, 2026, pp. 1–4.

[5] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16×16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations (ICLR), 2021.

[6] S. Sharifian, M. Taheri, V. Rashtchi, A. Azarpeyvand, C. Herglotz, and M. Jenihhin, “Reliability-aware hyperparameter optimization for ann-to-snn conversion,” WiPiEC Journal-Works in Progress in Embedded Computing Journal, vol. 11, no. 1, pp. 7–7, 2025.

[7] S. Nazari, M. Taheri, A. Azarpeyvand, M. Afsharchi, C. Herglotz, and M. Jenihhin, “Genie: Genetic algorithm-based reliability assessment methodology for deep neural networks,” in 2025 11th International Conference on Computing and Artificial Intelligence (ICCAI). IEEE, 2025, pp. 264–271.

[8] S. Nazari, M. Taheri, A. Azarpeyvand, M. Afsharchi, C. Herglotz, and M. Jenihhin, “Reliability-aware performance optimization of dnn hw accelerators through heterogeneous quantization,” in 2025 IEEE 26th Latin American Test Symposium (LATS). IEEE, 2025, pp. 1–6.

[9] S. Nazari, M. Taheri, A. Azarpeyvand, M. Afsharchi, T. Ghasempouri, C. Herglotz, M. Daneshtalab, and M. Jenihhin, “Fortune: A negative memory overhead hardware-agnostic fault tolerance technique in dnns,” Authorea Preprints, 2024.

[10] M. Taheri, M. Taheri, and A. Hadjahmadi, “Noise-tolerance gpu-based age estimation using resnet-50,” arXiv preprint arXiv:2305.00848, 2023.

[11] B. Reagen, U. Gupta, L. Pentecost, P. Whatmough, S. K. Lee, N. Mulholland, D. Brooks, and G.-Y. Wei, “Ares: A framework for quantifying the resilience of deep neural networks,” in ACM/IEEE Design Automation Conference (DAC), 2018, pp. 1–6.

[12] R. Leveugle, A. Calvez, P. Maistri, and P. Vanhauwaert, “Statistical fault injection: Quantified error and confidence,” in Design, Automation and Test in Europe Conference (DATE), 2009, pp. 502–506, TIMA Laboratory, Grenoble INP, UJF, CNRS.

[13] A. Ruospo, M. Sonza Reorda, R. Mariani, and E. Sanchez, “An effective iterative statistical fault injection methodology for deep neural networks,” IEEE Transactions on Computers, vol. 74, pp. 2431–2444, 2025.

[14] D. Hendrycks and K. Gimpel, “Gaussian error linear units (GELUs),” arXiv preprint arXiv:1606.08415, 2016.

[15] R. Baumann, “Radiation-induced soft errors in advanced semiconductor technologies,” IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, pp. 305–316, 2005.

[16] X. Xue, C. Liu, Y. Wang, B. Yang, T. Luo, L. Zhang, H. Li, and X. Li, “Soft error reliability analysis of vision transformers,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 12, pp. 2126–2136, 2023.

[17] J. He, Y. Liu, C. Xu, X. Liao, and Y. Yang, “Fine-grained fault sensitivity analysis of vision transformers under soft errors,” Electronics, vol. 14, p. 2418, 2025.

[18] E.-Y. Liao and T.-C. Wang, “Analyzing and enhancing the reliability of vision transformer models against soft errors,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2025, pp. 1–5.

[19] P. Helber, B. Bischke, A. Dengel, and D. Borth, “EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 12, no. 7, pp. 2217–2226, 2019.

[20] A. Krizhevsky, “Learning multiple layers of features from tiny images,” University of Toronto, Tech. Rep., 2009. [Online]. Available: https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf

[21] R. Wightman, “PyTorch Image Models,” GitHub, 2019. [Online]. Available: https://github.com/huggingface/pytorch-image-models