Predicting Neural Network Accuracy from Weights
We show experimentally that the accuracy of a trained neural network can be predicted surprisingly well by looking only at its weights, without evaluating it on input data. We motivate this task and introduce a formal setting for it. Even when using simple statistics of the weights, the predictors are able to rank neural networks by their performance with very high accuracy (R² score more than 0.98). Furthermore, the predictors are able to rank networks trained on different, unobserved datasets and with different architectures. We release a collection of 120k convolutional neural networks trained on four different datasets to encourage further research in this area, with the goal of understanding network training and performance better.
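To make the idea of predicting accuracy from "simple statistics of the weights" concrete, here is a minimal sketch. It is not the paper's pipeline: the feature set (per-layer mean, variance, and five percentiles), the ridge regressor, the helper names (`weight_features`, `fit_linear_predictor`, `predict`, `r2_score`), and above all the synthetic "networks" and their made-up accuracies are assumptions chosen only for illustration.

```python
import numpy as np

def weight_features(layers):
    """Summarize each layer's weights with a few simple statistics."""
    feats = []
    for w in layers:
        flat = np.ravel(w)
        feats.extend([flat.mean(), flat.var(),
                      *np.percentile(flat, [0, 25, 50, 75, 100])])
    return np.asarray(feats)

def fit_linear_predictor(X, y, ridge=1e-3):
    """Closed-form ridge regression; the last coefficient is the bias."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict(coef, X):
    """Apply the fitted linear predictor to a feature matrix."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ coef

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for a collection of trained networks (assumption):
# each "network" has two weight tensors drawn at a random scale s, and its
# "accuracy" is an invented linear function of s. No model is trained here.
rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    s = rng.uniform(0.5, 1.5)
    layers = [rng.normal(0.0, s, size=(40, 25)), rng.normal(0.0, s, size=(25, 40))]
    X.append(weight_features(layers))
    y.append(1.0 - 0.4 * s)
X, y = np.asarray(X), np.asarray(y)

# Fit on the first 150 "networks", evaluate on the 50 held-out ones.
coef = fit_linear_predictor(X[:150], y[:150])
r2 = r2_score(y[150:], predict(coef, X[150:]))
print(f"held-out R^2 on synthetic data: {r2:.3f}")
```

The point of the sketch is the shape of the workflow: weights go in, a fixed-length feature vector comes out, and a regressor maps features to accuracy without any forward pass on input data. The score it prints reflects only the toy data and says nothing about the results reported in the abstract.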
Forward citations
Cited by 6 Pith papers
- Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability
  Neural networks admit large families of approximately equivalent solutions via neuron identifiability even without structural symmetry, enabling linear low-loss merging paths without prior alignment.
- ModelLens: Finding the Best for Your Task from Myriads of Models
  ModelLens learns a performance-aware latent space from 1.62M leaderboard records to rank unseen models on unseen datasets without forward passes on the target.
- Evaluation without Generation: Non-Generative Assessment of Harmful Model Specialization with Applications to CSAM
  Gaussian probing infers harmful model specialization from parameter perturbations and internal representation responses to Gaussian latent ensembles rather than from generated outputs.
- Dynamic Neural Graph Encoding of Inference Processes in Deep Weight Space
  DNG-Encoder represents NN weights as dynamic graphs to preserve sequential inference and powers INR2JLS, which raises INR classification accuracy by ~10% on CIFAR-100-INR.
- What Linear Probes Miss: Multi-View Probing for Weight-Space Learning
  MVProbe is a multi-perspective probing framework for weight-space learning that combines first-order and Gram-based views and outperforms ProbeX on the Model Jungle benchmark.
- Towards Learning Representations of Policies in Two-Player Zero-Sum Imperfect-Information Games
  Basic dataset creation, embedding learning, and evaluation tasks on Kuhn and Leduc Poker demonstrate that useful behavioral representations appear in the learned embeddings.