Dataset distillation creates a tiny synthetic training set that, when used with a fixed network initialization, produces models whose performance approximates that of models trained on the full original dataset.
Striving for Simplicity: The All Convolutional Net
20 Pith papers cite this work. Polarity classification is still indexing.
abstract
Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: Alternating convolution and max-pooling layers followed by a small number of fully connected layers. We re-evaluate the state of the art for object recognition from small images with convolutional networks, questioning the necessity of different components in the pipeline. We find that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks. Following this finding -- and building on other recent work for finding simple network structures -- we propose a new architecture that consists solely of convolutional layers and yields competitive or state of the art performance on several object recognition datasets (CIFAR-10, CIFAR-100, ImageNet). To analyze the network we introduce a new variant of the "deconvolution approach" for visualizing features learned by CNNs, which can be applied to a broader range of network structures than existing approaches.
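As a rough illustration of the abstract's central claim, the sketch below contrasts a 2x2 max-pooling step with a stride-2 convolution on a single-channel toy input. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the 4x4 input, the 2x2 kernel size, and the fixed averaging weights are all hypothetical choices made for illustration. In a real network the kernel weights are learned and the layer mixes many channels. The only point shown is that both operations reduce spatial resolution by the same factor.

```python
def max_pool2d(image, size=2, stride=2):
    """Max-pool a 2D list of numbers with a square window (no padding)."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [
        [
            max(image[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def strided_conv2d(image, kernel, stride=2):
    """Cross-correlate a 2D list with a square kernel at a given stride (no padding)."""
    size = len(kernel)
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [
        [
            sum(kernel[i][j] * image[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 single-channel input.
image = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [2, 0, 7, 8],
]
# Hypothetical fixed weights; a trained network would learn these values.
kernel = [[0.25, 0.25], [0.25, 0.25]]

pooled = max_pool2d(image)                 # 4x4 -> 2x2 via a fixed max
convolved = strided_conv2d(image, kernel)  # 4x4 -> 2x2 via a weighted sum

print(pooled)     # [[4, 1], [2, 8]]
print(convolved)  # [[2.5, 0.5], [0.75, 6.5]]
```

Both outputs are 2x2, so the strided convolution can stand in for the pooling layer without changing the shape seen by later layers. The difference is that the downsampling rule becomes a set of trainable weights rather than a fixed maximum.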
citation-role summary
citation-polarity summary
roles
background 1
polarities
background 1
representative citing papers
Structured updates (low-rank or masked) and sketched updates (quantized, rotated, subsampled) reduce uplink communication in federated learning by up to two orders of magnitude on convolutional and recurrent networks.
DCGANs with architectural constraints learn a hierarchy of representations from object parts to scenes in both generator and discriminator across image datasets.
InChIfied Invariants based on InChI achieve 99.62% identical representations for chemically equivalent molecular graphs versus 0.35% for standard Daylight invariants on one million PubChem molecules, while preserving predictive performance and enforcing consistent attributions.
Spectral Integrated Gradients constructs SVD-based integration paths that activate singular components from largest to smallest, producing cleaner attribution maps and better quantitative scores than standard Integrated Gradients on image classification tasks.
Latte performs seed-centered one-step latent mutations along class anchors in VQ-VAE space to produce diverse, low-drift, fault-revealing DNN tests.
ExPath is a subgraph inference framework that classifies bio-networks with experimental data and uses explanations to identify targeted pathways, reporting up to 4.5x higher Fidelity+ and 14x lower Fidelity- than baselines on 301 networks.
SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.
Saliency-driven interpretation methods reveal that NMT models learn word alignments of better quality than fast-align under force decoding and consistent with automatic tools under free decoding.
Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.
Convolutional sparse autoencoder on two-channel sEMG delivers 94.3% multi-subject F1 for six gestures, 92.3% after few-shot transfer to unseen subjects, and 90% after incremental extension to ten classes.
AdaProb performs machine unlearning by substituting final-layer output probabilities with optimized uniform pseudo-probabilities and updating model weights.
Empirical comparison shows gradient-based explanations for GNN node similarities are actionable, consistent, and retain effects when sparsified, unlike mutual information explanations.
Benchmark study of ten GNN explainers on eight architectures and six datasets that isolates usable components and issues practical recommendations.
Methods are introduced to lift static attribution techniques to dynamical models for explaining risk increases in clinical alert systems.
NRM enables OoD detection by joint latent likelihood, assigning lower values to SVHN than CIFAR-10 (unlike VAEs/flows) and consistent across other OoD sets.
A cycle-consistent GAN generates counterfactual medical images to attribute classification decisions more comprehensively than standard saliency methods.
Overparameterized DNNs enable more effective machine unlearning for privacy and bias removal via localized decision-region adjustments, with performance depending on method access to forgotten data.
Hierarchical multigraph GCNs applied to superpixels achieve competitive or superior accuracy to CNNs on standard image classification benchmarks.
The paper delivers a mechanism-centric taxonomy and unified perspective on explainable human activity recognition methods across sensing modalities.
citing papers explorer
Explaining the Explainers in Graph Neural Networks: a Comparative Study
Benchmark study of ten GNN explainers on eight architectures and six datasets that isolates usable components and issues practical recommendations.