hub Tool reference
Learning multiple layers of features from tiny images
Tool reference. 75% of classified Pith citations use this work as a method, library, or software dependency, not as a substantive claim.
citing papers explorer
- When Stronger Triggers Backfire: A High-Dimensional Theory of Backdoor Attacks
  In the proportional high-dimensional regime, stronger backdoor training triggers improve clean accuracy and make attack success non-monotonic for regularized GLMs on Gaussian mixtures, with closed-form proofs for squared loss and fixed-point extensions to convex losses.
- What Time Is It? How Data Geometry Makes Time Conditioning Optional for Flow Matching
  Data geometry makes time identifiable from noisy interpolants at rate O(1/sqrt(d-k)), rendering the time-blindness gap asymptotically negligible relative to coupling variance.
- Adaptive Signal Resuscitation: Channel-wise Post-Pruning Repair for Sparse Vision Networks
  ASR applies per-channel variance-matching corrections stabilized by data-driven shrinkage to recover accuracy in highly sparse convolutional networks without retraining.
- FeatCal: Feature Calibration for Post-Merging Models
  FeatCal reduces feature drift in merged models via layer-wise closed-form calibration on a small dataset, outperforming prior post-merging methods on CLIP and GLUE benchmarks with high sample efficiency.
- Fitting Multilinear Polynomials for Logic Gate Networks
  Fitting logic gates as 4D multilinear polynomials with covariance Jacobian selection matches or beats 16D softmax baselines on seven datasets and remains stable at 12-layer depth where the baseline drops 37 points on CIFAR-10.
- Characterizing and Correcting Effective Target Shift in Online Learning
  Online kernel regression equals offline regression with shifted targets; correcting the targets lets online learning match offline performance and outperform true targets in continual image classification.
- Disagreement-Regularized Importance Sampling for Adversarial Label Corruption
  DR-IS selects low-contamination subsets via bounded rank-disagreement in proxy ensembles under an ε-contamination model, with O(√(log(N/δ)/K)) concentration rates that certify separation when the expectation gap Δ' is positive.
- GEODE: Angle-Adaptive OOD Detection with Universal Scorer Compatibility
  GEODE uses per-sample cosine-similarity scaling in a norm loss to preserve feature geometry for universal scorer-compatible OOD detection, matching or exceeding OE performance on CIFAR benchmarks.
- Winfree Oscillatory Neural Network
  WONN is a new oscillatory neural network based on generalized Winfree dynamics that scales competitively to ImageNet-1K and reaches 80.1% accuracy on Maze-hard with 1% of prior model parameters.
- Early High-Frequency Injection for Geometry-Sensitive OOD Detection
  EIHF injects high-frequency evidence early to reshape class-conditional feature geometry and reduce ID/OOD Mahalanobis overlap for geometry-sensitive OOD detection.
- TIDE: Asymmetric Neural Circuits for Stabilized Temporal Inhibitory-Excitatory Dynamics
  TIDE is a neuro-inspired architecture using stabilized asymmetric E-I networks with lateral inhibition and 80:20 balance that trains in under half the time of CTM while gaining +1.65% top-1 accuracy on perturbed ImageNet.
- Mechanistically Interpretable Neural Encoding Reveals Fine-Grained Functional Selectivity in Human Visual Cortex
  MINE uses mechanistic interpretability on language-aligned image representations to generate per-voxel feature descriptions, validated via image generation and counterfactual edits that causally shift brain activation.
- Let the Target Select for Itself: Data Selection via Target-Aligned Paths
  Target-aligned data selection via normalized endpoint loss drop on a validation-induced reference path achieves competitive performance with reduced computational overhead.
- Contextual Plackett-Luce: An Efficient Neural Model for Probabilistic Sequence Selection under Ambiguity
  Contextual Plackett-Luce extends the classical Plackett-Luce model with context-dependent Ising parameterization to enable efficient parallel scoring followed by incremental autoregressive selection for ambiguous sequence tasks.
- VISTA: Decentralized Machine Learning in Adversary Dominated Environments
  VISTA adaptively tunes consistency thresholds in decentralized SGD so that the system converges asymptotically like standard SGD even when adversaries dominate the worker pool.
- Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning
  New optimizer uses auxiliary loss to imitate low-order Hessian information, replacing gradient squares in Adam-like training with convergence guarantee and some experimental gains.
- When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy
  Hierarchy-Aware Cross-Entropy improves image classification by incorporating class hierarchies into the loss through prediction aggregation and ancestral label smoothing, achieving mean accuracy gains of 4.66% in end-to-end training and 2.18% in linear probing.
- SignMuon: Communication-Efficient Distributed Muon Optimization
  SignMuon merges majority-vote sign aggregation from signSGD with Muon's polar-factor steps to create a communication-efficient distributed optimizer that matches signSGD rates under symmetric noise and shows strong empirical results on CIFAR and nanoGPT.
- DR-SNE: Density-Regularized Stochastic Neighbor Embedding
  DR-SNE augments the SNE objective with a density regularization term from normalized log-density estimates to preserve relative densities while retaining neighborhood structure.
- LightSplit: Practical Privacy-Preserving Split Learning via Orthogonal Projections
  LightSplit uses non-invertible orthogonal projections as an information bottleneck in split learning to reduce transmitted dimensionality by 32x while retaining more than 95% accuracy and limiting reconstruction risk.
- Metric Unreliability in Multimodal Machine Unlearning: A Systematic Analysis and Principled Unified Score
  Standard unlearning metrics disagree in multimodal settings, but a correlation-weighted Unified Quality Score delivers consistent method rankings across benchmarks.
- NeuroPlastic: A Plasticity-Modulated Optimizer for Biologically Inspired Learning Dynamics
  NeuroPlastic is a gradient-based optimizer augmented with a multi-signal plasticity modulation mechanism that improves performance over standard updates on image classification tasks, especially in low-data regimes.
- Weight Concentration Regularization for Improving Pruning Robustness Under High Sparsity
  WCR is a new training regularizer that concentrates weight magnitudes onto few parameters to improve one-shot pruning robustness under aggressive sparsity.
- FAR: Function-preserving Attention Replacement for IMC-friendly Inference
  FAR substitutes self-attention in pretrained DeiTs with multi-head bidirectional LSTMs via block-wise distillation and structured pruning to enable IMC-compatible inference with comparable accuracy and lower latency.
- Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP
  Matched learning-rate experiments show LoRA retains substantially higher zero-shot transfer (45% vs 11% on EuroSAT, 58% vs 9% on Pets) than Full FT in CLIP adaptation.