Towards Task and Architecture-Independent Generalization Gap Predictors

Hanna Mazzawi; Javier Gonzalvo; Scott Yak

arxiv: 1906.01550 · v1 · pith:DNGA7CSQnew · submitted 2019-06-04 · 📊 stat.ML · cs.LG

Scott Yak, Javier Gonzalvo, Hanna Mazzawi

keywords: architecture-independent, different, generalization, rnns, dataset, deep, dnns, learning

Can we use deep learning to predict when deep learning works? Our results suggest the affirmative. We created a dataset by training 13,500 neural networks with different architectures, on different variations of spiral datasets, and using different optimization parameters. We used this dataset to train task-independent and architecture-independent generalization gap predictors for those neural networks. We extend Jiang et al. (2018) to also use DNNs and RNNs and show that they outperform the linear model, obtaining $R^2=0.965$. We also show results for architecture-independent, task-independent, and out-of-distribution generalization gap prediction tasks. Both DNNs and RNNs consistently and significantly outperform linear models, with RNNs obtaining $R^2=0.584$.
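To make the quantities in the abstract concrete, here is a minimal sketch of how a generalization gap and the $R^2$ score of a linear baseline predictor could be computed. It assumes the gap is defined as training accuracy minus test accuracy, and that each trained network is summarized by a fixed-length feature vector. The function names and the least-squares baseline are illustrative assumptions, not the authors' code or feature set.

```python
import numpy as np

def generalization_gap(train_acc, test_acc):
    """Gap per trained network, assumed here to be train accuracy minus test accuracy."""
    return np.asarray(train_acc, dtype=float) - np.asarray(test_acc, dtype=float)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_linear_predictor(features, gaps):
    """Hypothetical linear baseline: least-squares fit of gap ~ features @ w + b."""
    # Append a column of ones so the last coefficient acts as the bias term.
    X = np.column_stack([features, np.ones(len(features))])
    coef, *_ = np.linalg.lstsq(X, gaps, rcond=None)
    return coef

def predict_linear(coef, features):
    """Apply the fitted linear baseline to a feature matrix."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ coef
```

A DNN or RNN predictor would replace `fit_linear_predictor` and `predict_linear` while being scored with the same `r_squared` function on held-out networks, which is what makes the reported $R^2$ values comparable across predictor types.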
