A downsampled variant of ImageNet as an alternative to the CIFAR datasets

Patryk Chrabaszcz, Ilya Loshchilov, Frank Hutter · 2017 · cs.CV · arXiv 1707.08819

15 Pith papers cite this work. Polarity classification is still indexing.

abstract

The original ImageNet dataset is a popular large-scale benchmark for training Deep Neural Networks. Since the cost of performing experiments (e.g., algorithm design, architecture search, and hyperparameter tuning) on the original dataset might be prohibitive, we propose to consider a downsampled version of ImageNet. In contrast to the CIFAR datasets and earlier downsampled versions of ImageNet, our proposed ImageNet32$\times$32 (and its variants ImageNet64$\times$64 and ImageNet16$\times$16) contains exactly the same number of classes and images as ImageNet, with the only difference that the images are downsampled to 32$\times$32 pixels per image (64$\times$64 and 16$\times$16 pixels for the variants, respectively). Experiments on these downsampled variants are dramatically faster than on the original ImageNet and the characteristics of the downsampled datasets with respect to optimal hyperparameters appear to remain similar. The proposed datasets and scripts to reproduce our results are available at http://image-net.org/download-images and https://github.com/PatrykChrabaszcz/Imagenet32_Scripts
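
To make the downsampling step described in the abstract concrete, here is a minimal sketch in plain Python. It assumes box averaging over non-overlapping blocks and a synthetic 64x64 input; both are illustrative assumptions, and the function name `box_downsample` is hypothetical. It is not taken from the authors' scripts, which are linked in the abstract.

```python
def box_downsample(image, out_size):
    """Downsample a square RGB image by averaging non-overlapping blocks.

    image: list of rows, each row a list of (r, g, b) integer tuples.
    out_size: target side length; must divide the input side length.
    Box averaging is an assumption made for this sketch.
    """
    in_size = len(image)
    if in_size % out_size != 0:
        raise ValueError("out_size must divide the input side length")
    block = in_size // out_size
    area = block * block
    result = []
    for oy in range(out_size):
        row = []
        for ox in range(out_size):
            sums = [0, 0, 0]
            # Accumulate every pixel in the block that maps to (ox, oy).
            for y in range(oy * block, (oy + 1) * block):
                for x in range(ox * block, (ox + 1) * block):
                    pixel = image[y][x]
                    for c in range(3):
                        sums[c] += pixel[c]
            # Integer mean per channel.
            row.append(tuple(s // area for s in sums))
        result.append(row)
    return result


# Synthetic 64x64 gradient standing in for a real photo (hypothetical data).
source = [[(x * 4, y * 4, 128) for x in range(64)] for y in range(64)]
small = box_downsample(source, 32)
print(len(small), len(small[0]))
```

Calling `box_downsample(source, 16)` on the same input gives the 16x16 counterpart. In each case the number of images and labels is unchanged; only the per-image resolution shrinks.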

citation-role summary

dataset: 2 · background: 1 · other: 1

citation-polarity summary

use dataset: 2 · background: 1 · unclear: 1

representative citing papers

Building Normalizing Flows with Stochastic Interpolants

cs.LG · 2022-09-30 · conditional · novelty 8.0

Normalizing flows are constructed by learning the velocity of a stochastic interpolant via a quadratic loss derived from its probability current, yielding an efficient ODE-based alternative to diffusion models.

StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow

stat.ML · 2026-05-15 · unverdicted · novelty 7.0

StAD distills divergence of PF-ODEs via the Langevin-Stein operator for faster, lower-variance likelihood estimation in generative models without Jacobian costs.

Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting

cs.CV · 2025-08-06 · unverdicted · novelty 7.0

The paper offers a comprehensive survey and proposes a new taxonomy for continual learning strategies in VLMs and MLLMs to combat catastrophic forgetting beyond traditional methods.

Zero-Shot Neural Network Evaluation with Sample-Wise Activation Patterns

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

SWAP-Score evaluates neural networks without training by quantifying sample-wise activation patterns, achieving high correlation with true performance on CIFAR-10 for CNNs and GLUE for Transformers while enabling fast NAS.

Scaling Laws for Autoregressive Generative Modeling

cs.LG · 2020-10-28 · accept · novelty 7.0

Autoregressive transformers follow power-law scaling laws for cross-entropy loss with nearly universal exponents relating optimal model size to compute budget across four domains.

Structuring Open-Ended NAS: Semi-Automated Design Knowledge Structuring with LLMs for Efficient Neural Architecture Search

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

Authors structure architectural design knowledge with LLMs to create an open-ended NAS space and introduce FairNAD, which finds architectures improving 0.84, 2.17, and 2.35 points over SOTA on CIFAR-10, CIFAR-100, and ImageNet16-120.

Scaling Laws for Transfer

cs.LG · 2021-02-02 · unverdicted · novelty 6.0

Effective data transferred from pre-training to fine-tuning is described by a power law in model parameter count and fine-tuning dataset size, acting like a multiplier on the fine-tuning data.

LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search

cs.LG · 2026-04-17 · unverdicted · novelty 6.0

LLMasTool improves neural architecture search by evolving code-mined hierarchical trees with diversity-guided Bayesian planning and targeted LLM assistance, reporting gains of 0.69, 1.83, and 2.68 points on CIFAR-10, CIFAR-100, and ImageNet16-120.

SubFLOT: Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport

cs.LG · 2026-04-08 · unverdicted · novelty 6.0

SubFLOT uses optimal transport to generate data-aware personalized submodels via server-side pruning and scaling-based adaptive regularization to mitigate parametric divergence in heterogeneous federated learning.

Language Models (Mostly) Know What They Know

cs.CL · 2022-07-11 · unverdicted · novelty 6.0

Language models show good calibration when asked to estimate the probability that their own answers are correct, with performance improving as models get larger.

A General Language Assistant as a Laboratory for Alignment

cs.CL · 2021-12-01 · conditional · novelty 6.0

Ranked preference modeling outperforms imitation learning for language model alignment and scales more favorably with model size.

Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

RobustLT adaptively adjusts perturbations in adversarial training to simultaneously improve robustness and class balance on long-tailed datasets.

Deterministic Decomposition of Stochastic Generative Dynamics

cs.LG · 2026-05-09 · unverdicted · novelty 5.0 · 2 refs

Stochastic generative dynamics are decomposed into transport and osmotic parts via b_t = u_t + d_t, with Bridge Matching proposed to learn the components for controllable sampling.

Elucidating the SNR-t Bias of Diffusion Probabilistic Models

cs.CV · 2026-04-17 · unverdicted · novelty 4.0

Diffusion models have an SNR-timestep mismatch during inference that the authors mitigate with per-frequency differential correction, raising generation quality across IDDPM, ADM, DDIM and others.

Coreset-Induced Conditional Velocity Flow Matching

stat.ML · 2026-05-13

citing papers explorer

Showing 15 of 15 citing papers.

Building Normalizing Flows with Stochastic Interpolants
cs.LG · 2022-09-30 · conditional · none · ref 112

StAD: Stein Amortized Divergence for Fast Likelihoods with Diffusion and Flow
stat.ML · 2026-05-15 · unverdicted · none · ref 4 · internal anchor

Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting
cs.CV · 2025-08-06 · unverdicted · none · ref 122 · internal anchor

Zero-Shot Neural Network Evaluation with Sample-Wise Activation Patterns
cs.LG · 2026-05-08 · unverdicted · none · ref 56

Scaling Laws for Autoregressive Generative Modeling
cs.LG · 2020-10-28 · accept · none · ref 3

Structuring Open-Ended NAS: Semi-Automated Design Knowledge Structuring with LLMs for Efficient Neural Architecture Search
cs.CV · 2026-05-19 · unverdicted · none · ref 11 · internal anchor

Scaling Laws for Transfer
cs.LG · 2021-02-02 · unverdicted · none · ref 37 · internal anchor

LLM as a Tool, Not an Agent: Code-Mined Tree Transformations for Neural Architecture Search
cs.LG · 2026-04-17 · unverdicted · none · ref 14

SubFLOT: Submodel Extraction for Efficient and Personalized Federated Learning via Optimal Transport
cs.LG · 2026-04-08 · unverdicted · none · ref 7

Language Models (Mostly) Know What They Know
cs.CL · 2022-07-11 · unverdicted · none · ref 123

A General Language Assistant as a Laboratory for Alignment
cs.CL · 2021-12-01 · conditional · none · ref 65

Taming the Long Tail: Rebalancing Adversarial Training via Adaptive Perturbation
cs.LG · 2026-05-13 · unverdicted · none · ref 8 · internal anchor

Deterministic Decomposition of Stochastic Generative Dynamics
cs.LG · 2026-05-09 · unverdicted · none · ref 1 · 2 links · internal anchor

Elucidating the SNR-t Bias of Diffusion Probabilistic Models
cs.CV · 2026-04-17 · unverdicted · none · ref 8

Coreset-Induced Conditional Velocity Flow Matching
stat.ML · 2026-05-13 · unreviewed · ref 5 · internal anchor
