Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei · 2009

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

browse 5 citing papers

representative citing papers

Conceptualizing Embeddings: Sparse Disentanglement for Vision-Language Models

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

CEDAR learns an invertible rotation of vision-language embeddings to concentrate semantics into sparse, axis-aligned coordinates for improved interpretability.

Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

A new framework is introduced for end-to-end provable robustness against backdoor attacks by composing randomized smoothing with differentially private training via privacy profiles.

Does Engram Do Memory Retrieval in Autoregressive Image Generation?

cs.CV · 2026-05-13 · accept · novelty 7.0

Engram in AR image generation saves backbone FLOPs but trails pure AR baselines in FID and behaves as a gated side-pathway rather than a content-addressed retriever.

SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning

cs.CV · 2025-10-18 · unverdicted · novelty 6.0

SSL4RL reformulates self-supervised learning objectives into dense, verifiable reward signals for RL-based fine-tuning of vision-language models, yielding performance gains on reasoning benchmarks.

Improved Mean Flows: On the Challenges of Fastforward Generative Models

cs.CV · 2025-12-01 · unverdicted · novelty 5.0

Improved MeanFlow (iMF) reaches 1.72 FID on ImageNet 256x256 with one function evaluation by reformulating the training objective as a regression on instantaneous velocity and treating guidance as flexible conditioning variables.

citing papers explorer

Showing 5 of 5 citing papers.

Conceptualizing Embeddings: Sparse Disentanglement for Vision-Language Models cs.CV · 2026-05-21 · unverdicted · none · ref 28
CEDAR learns an invertible rotation of vision-language embeddings to concentrate semantics into sparse, axis-aligned coordinates for improved interpretability.
Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy cs.LG · 2026-05-20 · unverdicted · none · ref 11
A new framework is introduced for end-to-end provable robustness against backdoor attacks by composing randomized smoothing with differentially private training via privacy profiles.
Does Engram Do Memory Retrieval in Autoregressive Image Generation? cs.CV · 2026-05-13 · accept · none · ref 3
Engram in AR image generation saves backbone FLOPs but trails pure AR baselines in FID and behaves as a gated side-pathway rather than a content-addressed retriever.
SSL4RL: Revisiting Self-supervised Learning as Intrinsic Reward for Visual-Language Reasoning cs.CV · 2025-10-18 · unverdicted · none · ref 13
SSL4RL reformulates self-supervised learning objectives into dense, verifiable reward signals for RL-based fine-tuning of vision-language models, yielding performance gains on reasoning benchmarks.
Improved Mean Flows: On the Challenges of Fastforward Generative Models cs.CV · 2025-12-01 · unverdicted · none · ref 8
Improved MeanFlow (iMF) reaches 1.72 FID on ImageNet 256x256 with one function evaluation by reformulating the training objective as a regression on instantaneous velocity and treating guidance as flexible conditioning variables.

Imagenet: A large-scale hierarchical image database

fields

years

verdicts

representative citing papers

citing papers explorer