Learning Curves for Mutual Information Maximization
classification
❄️ cond-mat.dis-nn
keywords
learninginformationmutualcurvesdatamaximizationmodelnetworks
read the original abstract
An unsupervised learning procedure based on maximizing the mutual information between the outputs of two networks receiving different but statistically dependent inputs is analyzed (Becker and Hinton, Nature, 355, 92, 161). For a generic data model, I show that in the large sample limit the structure in the data is recognized by mutual information maximization. For a more restricted model, where the networks are similar to perceptrons, I calculate the learning curves for zero-temperature Gibbs learning. These show that convergence can be rather slow, and a way of regularizing the procedure is considered.
This paper has not been read by Pith yet.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.