Learning multiple layers of features from tiny images

Alex Krizhevsky · 2009

Tool reference. 75% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.

25 Pith papers citing it

citation-role summary

dataset: 5 · background: 2 · method: 1

citation-polarity summary

use dataset: 5 · background: 2 · use method: 1

representative citing papers

When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks

cs.LG · 2026-05-21 · unverdicted · novelty 8.0

In the proportional high-dimensional regime, stronger backdoor training triggers improve clean accuracy and make attack success non-monotonic for regularized GLMs on Gaussian mixtures, with closed-form proofs for squared loss and fixed-point extensions to convex losses.

What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching

cs.LG · 2026-05-08 · unverdicted · novelty 8.0

Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.

Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

ASR applies per-channel variance-matching corrections stabilized by data-driven shrinkage to recover accuracy in highly sparse convolutional networks without retraining.

FeatCal: Feature Calibration for Post-Merging Models

cs.LG · 2026-05-13 · conditional · novelty 7.0

FeatCal reduces feature drift in merged models via layer-wise closed-form calibration on a small dataset, outperforming prior post-merging methods on CLIP and GLUE benchmarks with high sample efficiency.

Fitting Multilinear Polynomials for Logic Gate Networks

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Fitting logic gates as 4D multilinear polynomials with covariance Jacobian selection matches or beats 16D softmax baselines on seven datasets and remains stable at 12-layer depth where the baseline drops 37 points on CIFAR-10.

Characterizing and Correcting Effective Target Shift in Online Learning

stat.ML · 2026-05-08 · unverdicted · novelty 7.0

Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.

Disagreement-Regularized Importance Sampling for Adversarial Label Corruption

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

DR-IS selects low-contamination subsets via bounded rank-disagreement in proxy ensembles under an ε-contamination model, with O(√(log(N/δ)/K)) concentration rates that certify separation when the expectation gap Δ' is positive.

GEODE: Angle-Adaptive OOD Detection with Universal Scorer Compatibility

cs.LG · 2026-05-01 · unverdicted · novelty 7.0

GEODE uses per-sample cosine-similarity scaling in a norm loss to preserve feature geometry for universal scorer-compatible OOD detection, matching or exceeding OE performance on CIFAR benchmarks.

Winfree Oscillatory Neural Network

cs.LG · 2026-05-20 · unverdicted · novelty 6.0

WONN is a new oscillatory neural network based on generalized Winfree dynamics that scales competitively to ImageNet-1K and reaches 80.1% accuracy on Maze-hard with 1% of prior model parameters.

Early High-Frequency Injection for Geometry-Sensitive OOD Detection

cs.CV · 2026-05-20 · conditional · novelty 6.0

EIHF injects high-frequency evidence early to reshape class-conditional feature geometry and reduce ID/OOD Mahalanobis overlap for geometry-sensitive OOD detection.

TIDE: Asymmetric Neural Circuits for Stabilized Temporal Inhibitory-Excitatory Dynamics

cs.LG · 2026-05-19 · unverdicted · novelty 6.0

TIDE is a neuro-inspired architecture using stabilized asymmetric E-I networks with lateral inhibition and 80:20 balance that trains in under half the time of CTM while gaining +1.65% top-1 accuracy on perturbed ImageNet.

Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

MINE uses mechanistic interpretability on language-aligned image representations to generate per-voxel feature descriptions, validated via image generation and counterfactual edits that causally shift brain activation.

Let the Target Select for Itself: Data Selection via Target-Aligned Paths

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Target-aligned data selection via normalized endpoint loss drop on a validation-induced reference path achieves competitive performance with reduced computational overhead.

Contextual Plackett-Luce: An Efficient Neural Model for Probabilistic Sequence Selection under Ambiguity

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

Contextual Plackett-Luce extends the classical Plackett-Luce model with context-dependent Ising parameterization to enable efficient parallel scoring followed by incremental autoregressive selection for ambiguous sequence tasks.

VISTA: Decentralized Machine Learning in Adversary Dominated Environments

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

VISTA adaptively tunes consistency thresholds in decentralized SGD so that the system converges asymptotically like standard SGD even when adversaries dominate the worker pool.

Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning

math.OC · 2026-05-07 · unverdicted · novelty 6.0

A new optimizer uses an auxiliary loss to imitate low-order Hessian information, replacing the squared gradients in Adam-like training, with a convergence guarantee and some experimental gains.

When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Hierarchy-Aware Cross-Entropy improves image classification by incorporating class hierarchies into the loss through prediction aggregation and ancestral label smoothing, achieving mean accuracy gains of 4.66% in end-to-end training and 2.18% in linear probing.

SignMuon: Communication-Efficient Distributed Muon Optimization

cs.LG · 2026-05-04 · unverdicted · novelty 6.0

SignMuon merges majority-vote sign aggregation from signSGD with Muon's polar-factor steps to create a communication-efficient distributed optimizer that matches signSGD rates under symmetric noise and shows strong empirical results on CIFAR and nanoGPT.

DR-SNE: Density-Regularized Stochastic Neighbor Embedding

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

DR-SNE augments the SNE objective with a density regularization term from normalized log-density estimates to preserve relative densities while retaining neighborhood structure.

LightSplit: Practical Privacy-Preserving Split Learning via Orthogonal Projections

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

LightSplit uses non-invertible orthogonal projections as an information bottleneck in split learning to reduce transmitted dimensionality by 32x while retaining more than 95% accuracy and limiting reconstruction risk.

Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score

cs.CV · 2026-05-04 · unverdicted · novelty 5.0 · 2 refs

Standard unlearning metrics disagree in multimodal settings, but a correlation-weighted Unified Quality Score delivers consistent method rankings across benchmarks.

NeuroPlastic: A Plasticity-Modulated Optimizer for Biologically Inspired Learning Dynamics

cs.LG · 2026-04-29 · unverdicted · novelty 5.0

NeuroPlastic is a gradient-based optimizer augmented with a multi-signal plasticity modulation mechanism that improves performance over standard updates on image classification tasks, especially in low-data regimes.

Weight Concentration Regularization for Improving Pruning Robustness Under High Sparsity

cs.LG · 2025-11-18 · unverdicted · novelty 5.0

WCR is a new training regularizer that concentrates weight magnitudes onto few parameters to improve one-shot pruning robustness under aggressive sparsity.

FAR: Function-preserving Attention Replacement for IMC-friendly Inference

cs.CV · 2025-05-24 · unverdicted · novelty 5.0

FAR substitutes self-attention in pretrained DeiTs with multi-head bidirectional LSTMs via block-wise distillation and structured pruning to enable IMC-compatible inference with comparable accuracy and lower latency.

citing papers explorer

Showing 25 of 25 citing papers.

When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks cs.LG · 2026-05-21 · unverdicted · none · ref 20
In the proportional high-dimensional regime, stronger backdoor training triggers improve clean accuracy and make attack success non-monotonic for regularized GLMs on Gaussian mixtures, with closed-form proofs for squared loss and fixed-point extensions to convex losses.
What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching cs.LG · 2026-05-08 · unverdicted · none · ref 17
Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.
Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks cs.LG · 2026-05-20 · unverdicted · none · ref 22
ASR applies per-channel variance-matching corrections stabilized by data-driven shrinkage to recover accuracy in highly sparse convolutional networks without retraining.
FeatCal: Feature Calibration for Post-Merging Models cs.LG · 2026-05-13 · conditional · none · ref 38
FeatCal reduces feature drift in merged models via layer-wise closed-form calibration on a small dataset, outperforming prior post-merging methods on CLIP and GLUE benchmarks with high sample efficiency.
Fitting Multilinear Polynomials for Logic Gate Networks cs.LG · 2026-05-09 · unverdicted · none · ref 31
Fitting logic gates as 4D multilinear polynomials with covariance Jacobian selection matches or beats 16D softmax baselines on seven datasets and remains stable at 12-layer depth where the baseline drops 37 points on CIFAR-10.
Characterizing and Correcting Effective Target Shift in Online Learning stat.ML · 2026-05-08 · unverdicted · none · ref 36
Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.
Disagreement-Regularized Importance Sampling for Adversarial Label Corruption cs.LG · 2026-05-08 · unverdicted · none · ref 18
DR-IS selects low-contamination subsets via bounded rank-disagreement in proxy ensembles under an ε-contamination model, with O(√(log(N/δ)/K)) concentration rates that certify separation when the expectation gap Δ' is positive.
GEODE: Angle-Adaptive OOD Detection with Universal Scorer Compatibility cs.LG · 2026-05-01 · unverdicted · none · ref 46
GEODE uses per-sample cosine-similarity scaling in a norm loss to preserve feature geometry for universal scorer-compatible OOD detection, matching or exceeding OE performance on CIFAR benchmarks.
Winfree Oscillatory Neural Network cs.LG · 2026-05-20 · unverdicted · none · ref 19
WONN is a new oscillatory neural network based on generalized Winfree dynamics that scales competitively to ImageNet-1K and reaches 80.1% accuracy on Maze-hard with 1% of prior model parameters.
Early High-Frequency Injection for Geometry-Sensitive OOD Detection cs.CV · 2026-05-20 · conditional · none · ref 24
EIHF injects high-frequency evidence early to reshape class-conditional feature geometry and reduce ID/OOD Mahalanobis overlap for geometry-sensitive OOD detection.
TIDE: Asymmetric Neural Circuits for Stabilized Temporal Inhibitory-Excitatory Dynamics cs.LG · 2026-05-19 · unverdicted · none · ref 47
TIDE is a neuro-inspired architecture using stabilized asymmetric E-I networks with lateral inhibition and 80:20 balance that trains in under half the time of CTM while gaining +1.65% top-1 accuracy on perturbed ImageNet.
Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex cs.CV · 2026-05-15 · unverdicted · none · ref 76
MINE uses mechanistic interpretability on language-aligned image representations to generate per-voxel feature descriptions, validated via image generation and counterfactual edits that causally shift brain activation.
Let the Target Select for Itself: Data Selection via Target-Aligned Paths cs.LG · 2026-05-10 · unverdicted · none · ref 25
Target-aligned data selection via normalized endpoint loss drop on a validation-induced reference path achieves competitive performance with reduced computational overhead.
Contextual Plackett-Luce: An Efficient Neural Model for Probabilistic Sequence Selection under Ambiguity cs.LG · 2026-05-09 · unverdicted · none · ref 19
Contextual Plackett-Luce extends the classical Plackett-Luce model with context-dependent Ising parameterization to enable efficient parallel scoring followed by incremental autoregressive selection for ambiguous sequence tasks.
VISTA: Decentralized Machine Learning in Adversary Dominated Environments cs.LG · 2026-05-08 · unverdicted · none · ref 62
VISTA adaptively tunes consistency thresholds in decentralized SGD so that the system converges asymptotically like standard SGD even when adversaries dominate the worker pool.
Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning math.OC · 2026-05-07 · unverdicted · none · ref 7
A new optimizer uses an auxiliary loss to imitate low-order Hessian information, replacing the squared gradients in Adam-like training, with a convergence guarantee and some experimental gains.
When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy cs.LG · 2026-05-07 · unverdicted · none · ref 11
Hierarchy-Aware Cross-Entropy improves image classification by incorporating class hierarchies into the loss through prediction aggregation and ancestral label smoothing, achieving mean accuracy gains of 4.66% in end-to-end training and 2.18% in linear probing.
SignMuon: Communication-Efficient Distributed Muon Optimization cs.LG · 2026-05-04 · unverdicted · none · ref 14
SignMuon merges majority-vote sign aggregation from signSGD with Muon's polar-factor steps to create a communication-efficient distributed optimizer that matches signSGD rates under symmetric noise and shows strong empirical results on CIFAR and nanoGPT.
DR-SNE: Density-Regularized Stochastic Neighbor Embedding cs.LG · 2026-05-03 · unverdicted · none · ref 28
DR-SNE augments the SNE objective with a density regularization term from normalized log-density estimates to preserve relative densities while retaining neighborhood structure.
LightSplit: Practical Privacy-Preserving Split Learning via Orthogonal Projections cs.LG · 2026-05-13 · unverdicted · none · ref 27
LightSplit uses non-invertible orthogonal projections as an information bottleneck in split learning to reduce transmitted dimensionality by 32x while retaining more than 95% accuracy and limiting reconstruction risk.
Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score cs.CV · 2026-05-04 · unverdicted · none · ref 30 · 2 links
Standard unlearning metrics disagree in multimodal settings, but a correlation-weighted Unified Quality Score delivers consistent method rankings across benchmarks.
NeuroPlastic: A Plasticity-Modulated Optimizer for Biologically Inspired Learning Dynamics cs.LG · 2026-04-29 · unverdicted · none · ref 17
NeuroPlastic is a gradient-based optimizer augmented with a multi-signal plasticity modulation mechanism that improves performance over standard updates on image classification tasks, especially in low-data regimes.
Weight Concentration Regularization for Improving Pruning Robustness Under High Sparsity cs.LG · 2025-11-18 · unverdicted · none · ref 23
WCR is a new training regularizer that concentrates weight magnitudes onto few parameters to improve one-shot pruning robustness under aggressive sparsity.
FAR: Function-preserving Attention Replacement for IMC-friendly Inference cs.CV · 2025-05-24 · unverdicted · none · ref 33
FAR substitutes self-attention in pretrained DeiTs with multi-head bidirectional LSTMs via block-wise distillation and structured pruning to enable IMC-compatible inference with comparable accuracy and lower latency.
Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP cs.LG · 2026-04-01 · unverdicted · none · ref 8
Matched learning-rate experiments show LoRA retains substantially higher zero-shot transfer (45% vs 11% on EuroSAT, 58% vs 9% on Pets) than Full FT in CLIP adaptation.
